In science things should be made
as simple as possible. 

Albert Einstein 

All the great things are simple. 

Winston Churchill

Abstract 

Metric regularity theory lies in the very heart of variational analysis, a relatively 
new discipline whose appearance was to a large extent determined by needs of modern 
optimization theory in which such phenomena as non-differentiability and set-valued 
mappings naturally appear. The roots of the theory go back to such fundamental 
results of the classical analysis as the implicit function theorem, Sard theorem and 
some others. The paper offers a survey of the state-of-the-art of some principal parts 
of the theory along with a variety of its applications in analysis and optimization. 
Introduction

Metric regularity has emerged during the last 2-3 decades as one of the central concepts of a
young discipline now often called variational analysis. The roots of this concept go back 
to a circle of fundamental regularity ideas of classical analysis embodied in such results 
as the implicit function theorem, Banach open mapping theorem, theorems of Lyusternik 
and Graves, on the one hand, and the Sard theorem and the Thom-Smale transversality 
theory, on the other. 

Smoothness is the key property of the objects to which the classical results are applied. 
Variational analysis, on the other hand, appeals to objects that may lack this property: 
functions and maps that are non-differentiable at points of interest, set-valued mappings,
etc. Such phenomena naturally appear in optimization theory and not only there.

In the traditional nonlinear analysis, regularity of a mapping (e.g. from a normed space 
or a manifold to another) at a certain point means that its derivative at the point is onto 
(the target space or the tangent space of the target manifold). This property, translated 
through available analytic or topological means to corresponding local properties of the 
mapping, plays a crucial role in studying some basic problems of analysis such as existence 
and behavior of solutions of a nonlinear equation $F(x) = y$ (with $F$ and $y$ viewed as data
and $x$ as unknown) under small perturbations of the data. Similar problems appear if,
instead of an equation, we consider the inclusion

$y \in F(x)$ (0.1)

(with $F$ a set-valued mapping this time) which, in essence, is the main object of study
in variational analysis. The challenge here is evident: there is no clear way to approximate the
mapping by simple objects like linear operators in the classical case.

The key step in the answer to the challenge was connected with the understanding 
of the metric nature of some basic phenomena that appear in the classical theory. This 
eventually led to the choice of the class of metric spaces as the main playground and subsequently
to abandoning approximation as the primary tool of analysis in favor of a direct
study of the phenomena as such. The "metric theory" offers a rich collection of results that,
being fairly general and stated in purely metric language, are nonetheless easily adaptable 
to Banach and finite dimensional settings (still among the most important in applications) 
and to various classes of mappings with special structure. Moreover, however surprising 
this may sound, the techniques coming from the metric theory sometimes appear more 
efficient, flexible and easy to use than the available Banach space techniques (associated 
with subdifferentials and coderivatives, especially in infinite dimensional Banach spaces). 

Grothendieck mentions "ubiquity of stratified structures in practically all domains of geometry" in
his 1984 Esquisse d'un Programme.
We shall see more than once that proper use of metric criteria may lead to dramatic simplification
of proofs and clarification of the ideas behind them. This occurs at all levels of
generality, from results valid in arbitrary metric spaces to specific facts about even fairly 
simple classes of finite dimensional mappings. 

It should be added furthermore that the central role played by distance estimates 
has determined a quantitative character of the theory (contrary to the predominantly 
qualitative character of the classical theory). Altogether, this opens gates to a number of 
new applications, such as say metric fixed point theory, differential inclusions, all chapters 
of optimization theory, numerical methods. 

This paper has appeared as a result of two short courses I gave in the University of 
Newcastle and the University of Chile in 2013-2014. The goal was to give a brief account 
of some major principles of the theory of metric regularity along with the impression of 
how they work in various areas of analysis and optimization. The three principal themes 
that will be in the focus of attention are: 

(a) regularity criteria (containing quantitative estimates for rates of regularity), including
formal comparisons of their relative power and precision;

(b) stability problems relating to the effect of perturbations of the mapping on its 
regularity properties, on the one hand, and to solutions of equations, inclusions etc. on 
the other; 

(c) role of metric regularity in analysis and optimization. 

The existing regularity theory of variational analysis may look very technical. Many 
available proofs take a lot of space and use heavy techniques. But the ideas behind most 
basic results, especially in the metric theory, are rather simple and in many cases proper 
application of the ideas leads to noticeable (occasionally even dramatic) simplification
and clarification of the proofs. This is a survey paper, so many results are quoted and 
discussed, often without proofs. As a rule, a proof is given if (a) the result is of a primary 
importance and the proof is sufficiently simple, (b) the result is new, (c) the access to the 
original publication containing the result is not very easy and especially (d) the proof is 
simpler (shorter, or looking more transparent) than available in the literature known to 
me. 

And of course there are topics (some important) not touched upon in the paper, especially
those that can be found in monographic literature. I mean first of all the books
by Dontchev and Rockafellar [55] and Klatte and Kummer [109] in which metric regularity,
in particular its finite dimensional chapter, is prominently presented. Among more
specialized topics not touched upon in the survey, I would mention nonlinear regularity 
models, point subdifferential regularity criteria with associated compactness properties of 
subdifferentials and directional regularity. 

The survey consists of two parts. The first part called ‘Theory’ contains an account of 
the basic ideas and principles of the metric regularity theory, first in traditional settings of 
the classical analysis and then for arbitrary set-valued mappings between various classes 
of spaces. In the second part ‘Applications’ we show how the theory works for some 
specific classes of maps that typically appear in variational analysis and for a variety
of fundamental existence, stability and optimization problems. In preparing this part 
of the survey the main efforts were focused on finding a productive balance between 
general principles and specific results and/or methods associated with the problem. This 
declaration may look like a truism, but publications in which over-attachment
to certain particular techniques of variational analysis (e.g. associated with
generalized differentiation) leads to long and poorly digestible proofs of sufficiently simple
and otherwise easily provable results are not an exceptional phenomenon.

To conclude the introduction I wish to express my thanks to J. Borwein and A. Jofré
for inviting me to give the lectures that were the basis for this paper and to J. Borwein
especially for his suggestion to write the survey. I also wish to thank D. Drusvyatskiy and
A. Lewis for the years of cooperation and many fruitful discussions and to A. Kruger and 
D. Klatte for many helpful remarks. 

Dedication. 2015 and late 2014 have witnessed remarkable jubilees of six of my good old
friends. I dedicate this paper, with gratitude for the past and warm wishes for the future,
to

Prof. Vladimir Lin Prof. Terry Rockafellar 

Prof. Louis Nirenberg Prof. Vladimir Tikhomirov 

Prof. Boris Polyak Prof. Nikita Vvedenskaya 

Notation. 

$d(x, Q)$ - distance from $x$ to $Q$;

$d(Q, P) = \inf\{\|x - u\| : x \in Q,\ u \in P\}$ - distance between $Q$ and $P$;

$\mathrm{ex}(Q, P) = \sup\{d(x, P) : x \in Q\}$ - excess of $Q$ over $P$;

$h(Q, P) = \max\{\mathrm{ex}(Q, P), \mathrm{ex}(P, Q)\}$ - Hausdorff distance between $Q$ and $P$;

$B(x, r)$ - closed ball of radius $r$ and center at $x$;

$\overset{\circ}{B}(x, r)$ - open ball of radius $r$ and center at $x$;

$F|_Q$ - the restriction of a mapping $F$ to the set $Q$;

$F : X \rightrightarrows Y$ - set-valued mapping;

$\mathrm{Graph}\, F = \{(x, y) : y \in F(x)\}$ - graph of $F$;

$I$ - the identity mapping (subscript, if present, indicates the space, e.g. $I_X$);

$\mathrm{epi}\, f = \{(x, \alpha) : \alpha \ge f(x)\}$ - epigraph of $f$;

$\mathrm{dom}\, f = \{x : f(x) < \infty\}$ - domain of $f$;

$i_Q(x)$ - indicator of $Q$ (function equal to $0$ on $Q$ and $+\infty$ outside);

$[f \le \alpha] = \{x : f(x) \le \alpha\}$ etc.;

$X \times Y$ - Cartesian product of spaces;

$X^*$ - adjoint of $X$;

$\langle x^*, x\rangle$ - the value of $x^*$ on $x$ (canonical bilinear form on $X^* \times X$);

$\mathbb{R}^n$ - the $n$-dimensional Euclidean space;

$B$ - the closed unit ball in a Banach space (sometimes indicated by a subscript, e.g.
$B_X$ is the unit ball in $X$);

$S_X$ - the unit sphere in $X$;

$\mathrm{Ker}\, A$ - kernel of the (linear) operator $A$;

$L^\perp = \{x^* \in X^* : \langle x^*, x\rangle = 0,\ \forall\, x \in L\}$ - annihilator of a subspace $L \subset X$;

$K^\circ = \{x^* \in X^* : \langle x^*, x\rangle \le 0,\ \forall\, x \in K\}$ - the polar of a cone $K \subset X$;

$\mathrm{Im}\, A$ - image of the operator $A$;

$\mathcal{S}(X)$ - collection of closed separable subspaces of $X$;
$\mathcal{L}(X, Y)$ - the space of linear bounded operators $X \to Y$ with the operator norm:

$\|A\| = \sup_{\|x\| \le 1} \|Ax\|$;

$L \oplus M$ - direct sum of subspaces;

$T_xM$, $N_xM$ - tangent and normal space to a manifold $M$ at $x \in M$;

$T(Q, x)$ - contingent cone to a set $Q$ at $x \in Q$;

$N(Q, x)$ - normal cone to $Q$ at $x \in Q$, often with a subscript (e.g. $N_F$ is a Fréchet
normal cone etc.)

We use the standard conventions $d(x, \emptyset) = \infty$; $\inf \emptyset = \infty$; $\sup \emptyset = -\infty$ with one exception:
when we deal with non-negative quantities we set $\sup \emptyset = 0$.
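
The excess and the Hausdorff distance from the list above are easy to experiment with on finite sets. The following minimal sketch assumes finite subsets of the real line with the usual distance; the function names are illustrative and not part of the survey. It also encodes the conventions for the empty set stated above.

```python
def dist_point_set(x, Q):
    """d(x, Q) for a finite set Q of reals; the infimum over the empty set is +inf."""
    return min((abs(x - q) for q in Q), default=float("inf"))


def excess(Q, P):
    """ex(Q, P) = sup{d(x, P) : x in Q}.

    The quantities are non-negative, so the supremum over the empty set is 0.
    """
    return max((dist_point_set(x, P) for x in Q), default=0.0)


def hausdorff(Q, P):
    """h(Q, P) = max{ex(Q, P), ex(P, Q)}."""
    return max(excess(Q, P), excess(P, Q))


Q = [0.0, 1.0]
P = [0.0, 1.0, 3.0]
print(excess(Q, P))     # Q is contained in P, so Q has no excess over P
print(excess(P, Q))     # the point 3.0 of P lies at distance 2.0 from Q
print(hausdorff(Q, P))
```

Note that the excess is not symmetric, which is why the Hausdorff distance takes the maximum of the two one-sided quantities.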


Part 1. Theory 

1 Classical theory: five great theorems. 

In this section all spaces are Banach. 

1.1 Banach-Schauder open mapping theorem

Theorem 1.1 ([17], [161]). Let $A : X \to Y$ be a linear bounded operator onto $Y$, that is
$A(X) = Y$. Then $0 \in \mathrm{int}\, A(B)$.

The theorem means that there is a $K > 0$ such that for any $y \in Y$ there is an $x \in X$
such that $A(x) = y$ and $\|x\| \le K\|y\|$ (take as $K$ the reciprocal of the radius of a ball in
$Y$ contained in the image of the unit ball in $X$ under $A$).

Definition 1.2 (Banach constant). Let $A : X \to Y$ be a bounded linear operator. The
quantity

$C(A) = \sup\{r \ge 0 : rB_Y \subset A(B_X)\} = \inf\{\|y\| : y \notin A(B_X)\}$

will be called the Banach constant of $A$.

The following simple proposition offers two more expressions for the Banach constant.
Given a linear operator $A : X \to Y$, we set

$\|A^{-1}\| = \sup_{\|y\| \le 1} d(0, A^{-1}(y)) = \sup_{\|y\| = 1} \inf\{\|x\| : Ax = y\}.$

Of course, if $A$ is a linear homeomorphism, this coincides with the usual norm of the
inverse operator.

Proposition 1.3 (calculation of $C(A)$). For a bounded linear operator $A : X \to Y$

$C(A) = \inf_{\|y^*\| = 1} \|A^* y^*\| = \|A^{-1}\|^{-1}.$
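
As a finite dimensional illustration of Proposition 1.3, consider a real 2x2 matrix with Euclidean norms on both spaces (an assumption made only for this sketch). In that setting the infimum of $\|A^* y^*\|$ over unit vectors is the smallest singular value of the matrix. The code below computes it in closed form and compares it with a brute-force minimum over sampled unit vectors; the function names and the sample matrix are hypothetical.

```python
import math


def banach_constant_2x2(a, b, c, d):
    """Smallest singular value of [[a, b], [c, d]] (Euclidean norms assumed)."""
    # The squared singular values are the eigenvalues of A A^T, whose
    # trace and determinant are given by the two expressions below.
    tr = a * a + b * b + c * c + d * d
    det = (a * d - b * c) ** 2
    disc = math.sqrt(max(tr * tr - 4.0 * det, 0.0))
    return math.sqrt(max((tr - disc) / 2.0, 0.0))


def banach_constant_sampled(a, b, c, d, n=3600):
    """Minimum of ||A^T y|| over n sampled unit vectors y = (cos s, sin s)."""
    best = float("inf")
    for k in range(n):
        s = 2.0 * math.pi * k / n
        y1, y2 = math.cos(s), math.sin(s)
        # For a real matrix the adjoint is the transpose.
        best = min(best, math.hypot(a * y1 + c * y2, b * y1 + d * y2))
    return best


# A = diag(2, 0.5): the image of the unit disc is an ellipse with
# semi-axes 2 and 0.5, so the largest inscribed ball has radius 0.5.
closed_form = banach_constant_2x2(2.0, 0.0, 0.0, 0.5)
sampled = banach_constant_sampled(2.0, 0.0, 0.0, 0.5)
print(closed_form, sampled)
```

For this diagonal matrix the inverse has norm 2, so the value 0.5 also agrees with the second expression $\|A^{-1}\|^{-1}$.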
1.2 Regular points of smooth maps. Theorems of Lyusternik and Graves.

Let $F : X \to Y$ be Fréchet differentiable at $\bar x \in X$. It is said that $F$ is regular at $\bar x$ if
its derivative $F'(\bar x)$ is a linear operator onto $Y$. Let $M \subset X$ be a smooth manifold. The
tangent space $T_xM$ to $M$ at $x \in M$ is the collection of $h \in X$ such that $d(x + th, M) = o(t)$
when $t \to +0$.

Theorem 1.4 (Lyusternik [126]). Suppose that $F$ is continuously differentiable and regular
at $\bar x$. Then the tangent space to the level set $M = \{x : F(x) = F(\bar x)\}$ at $\bar x$ coincides with
$\mathrm{Ker}\, F'(\bar x)$.

Theorem 1.5 (Graves [73]). Let $F$ be a continuous mapping from a neighborhood of
$\bar x \in X$ into $Y$. Suppose that there are a linear bounded operator $A : X \to Y$ and positive
numbers $\delta > 0$, $\gamma > 0$, $\varepsilon > 0$ such that $C(A) > \delta + \gamma$ and

$\|F(x') - F(x) - A(x' - x)\| \le \delta\|x' - x\|,$

whenever $x$ and $x'$ belong to the open $\varepsilon$-ball around $\bar x$. Then

$B(F(\bar x), \gamma t) \subset F(B(\bar x, t))$

for all $t \in (0, \varepsilon)$.

Here is a slight modification (quantities explicitly added) of the original proof by 
Graves. 

Proof. We may harmlessly assume that $F(\bar x) = 0$. Take $K > 0$ such that $KC(A) > 1 >
K(\delta + \gamma)$, and let $\|y\| < \gamma t$ for some $t < \varepsilon$. Set $x_0 = \bar x$, $y_0 = y$ and define recursively $x_n$,
$y_n$ as follows:

$y_{n-1} = A(x_n - x_{n-1}), \quad \|x_n - x_{n-1}\| \le K\|y_{n-1}\|, \quad y_n = A(x_n - x_{n-1}) - (F(x_n) - F(x_{n-1})).$

It is an easy matter to verify that

$\|x_n - x_{n-1}\| \le (K\delta)^{n-1} K\|y\|, \quad \|y_n\| \le (K\delta)^n \|y\|$

and $y_{n-1} - y_n = F(x_n) - F(x_{n-1})$, so that $(x_n)$ converges to some $x$ such that $F(x) = y$
and

$\|x - \bar x\| \le \frac{K}{1 - K\delta}\|y\| \le \gamma^{-1}\|y\| < t$

as claimed. □
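
The recursion in the proof can be run numerically. The sketch below is only an illustration under stated assumptions: the spaces are one-dimensional, the map $F(x) = 2x + 0.3\sin x$ is a hypothetical example with $A = 2$, $\delta = 0.3$ and base point $0$, and the equation $Ah = y_{n-1}$ is solved exactly, so that $K = 1/2$.

```python
import math


def F(x):
    # Hypothetical example: linear part 2x plus a 0.3-Lipschitz perturbation, F(0) = 0.
    return 2.0 * x + 0.3 * math.sin(x)


A = 2.0        # the approximating linear operator; its Banach constant is 2
DELTA = 0.3    # |F(x') - F(x) - A (x' - x)| <= DELTA * |x' - x| for all x, x'
K = 1.0 / A    # the solution h of A h = y satisfies |h| = K |y|


def graves_solve(y, steps=60):
    """Run the recursion x_n, y_n from the proof, starting at the base point 0."""
    x_prev, y_prev = 0.0, y
    for _ in range(steps):
        x_next = x_prev + y_prev / A                              # y_{n-1} = A (x_n - x_{n-1})
        y_next = A * (x_next - x_prev) - (F(x_next) - F(x_prev))  # new residual y_n
        x_prev, y_prev = x_next, y_next
    return x_prev


y = 0.8
x = graves_solve(y)
print(x, F(x) - y)                                   # residual is at rounding level
print(abs(x) <= K / (1.0 - K * DELTA) * abs(y))      # the distance estimate of the proof
```

Since $K\delta = 0.15$ here, the residuals $y_n$ shrink geometrically, which is exactly the mechanism behind the estimates in the proof.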

The theorem of Lyusternik was proved in 1934 and the theorem of Graves in 1950. 
Graves was apparently unaware of Lyusternik's result and Lyusternik, in turn, of the open
mapping theorem by Banach-Schauder. Nonetheless the methods they used in their proofs
were very similar. For that reason the following statement which is somewhat weaker than
the theorem of Graves and somewhat stronger than the theorem of Lyusternik is usually 
called the Lyusternik-Graves theorem. 
Theorem 1.6 (Lyusternik-Graves theorem). Assume that $F : X \to Y$ is continuously
differentiable and regular at $\bar x$. Then for any positive $r < C(F'(\bar x))$, there is an $\varepsilon > 0$ such
that

$B(F(x), rt) \subset F(B(x, t)),$

whenever $\|x - \bar x\| < \varepsilon$, $0 \le t < \varepsilon$.

It should be also emphasized that no differentiability assumption is made in the theorem
of Graves. In this respect Graves was much ahead of his time. Observe that the mapping
$F$ in the theorem of Graves can be viewed as a perturbation of $A$ by a $\delta$-Lipschitz mapping.
With this interpretation the theorem of Graves can be also viewed as a direct predecessor
of Milyutin's perturbation theorem (Theorem 4.2 in the fourth section), which is one of
the central results in the regularity theory of variational analysis.

1.3 Inverse and implicit function theorem

Theorem 1.7 (Inverse function theorem). Suppose that $F$ is continuously differentiable
at $\bar x$ and the derivative $F'(\bar x)$ is an invertible operator onto $Y$. Then there is a mapping
$G$ into $X$ defined in a neighborhood of $\bar y = F(\bar x)$, strictly differentiable at $\bar y$ and such that

$G'(\bar y) = (F'(\bar x))^{-1}$ and $F \circ G = I_Y$

in the neighborhood.

The shortest among standard proofs of the theorem is based on the contraction mapping
principle (see e.g. the second proof of the theorem in [55]). But an equally short proof
follows from the theorem of Lyusternik-Graves.

Proof. Set $A = F'(\bar x)$. Then $F(x') - F(x) - A(x' - x) = r(x', x)\|x' - x\|$, where $\|r(x', x)\| \to
0$ when $x, x' \to \bar x$. As $A$ is invertible, there is a $K > 0$ such that $\|Ah\| \ge K\|h\|$ for all $h$. Hence
$\|F(x') - F(x)\| \ge (K - \|r(x', x)\|)\|x' - x\| > 0$ if $x \ne x'$ are close to $\bar x$. This means that $F$ is
one-to-one in a neighborhood $U$ of $\bar x$. But by the Lyusternik-Graves theorem, $F(U)$ covers a
certain open neighborhood of $\bar y$. Hence $G = F^{-1}$ is defined in a neighborhood of $F(\bar x)$. So
take $y$ and $y'$ close to $\bar y = F(\bar x)$ and let $x'$, $x$ be such that $F(x') = y'$, $F(x) = y$. Then,
as we have seen, $\|y - y'\| \ge (K - \|r(x', x)\|)\|x - x'\|$. We have

$A^{-1}(F(x') - F(x) - A(x' - x)) = A^{-1}(y' - y) - (G(y') - G(y)),$

so that

$\|G(y') - G(y) - A^{-1}(y' - y)\| \le \|A^{-1}\|\,\|F(x') - F(x) - A(x' - x)\|$

$= \|A^{-1}\|\,\|r(x', x)\|\,\|x' - x\| \le q(y, y')\|y' - y\|,$

where $q(y, y') = \|A^{-1}\|\,\|r(x', x)\| / (K - \|r(x', x)\|)$ with $x = G(y)$, $x' = G(y')$ obviously goes to zero when $y, y' \to \bar y$. □

Theorem 1.8 (implicit function theorem). Let $X$, $Y$, $Z$ be Banach spaces, and let $F$
be a mapping into $Z$ which is defined in a neighborhood of $(\bar x, \bar y) \in X \times Y$ and strictly
differentiable at $(\bar x, \bar y)$. Suppose further that the partial derivative $F_y(\bar x, \bar y)$ is an invertible
operator. Then there are neighborhoods $U \subset X$ of $\bar x$ and $W \subset Z$ of $\bar z = F(\bar x, \bar y)$ and a
mapping $S : U \times W \to Y$ such that $(x, z) \mapsto (x, S(x, z))$ is a homeomorphism of $U \times W$
onto a neighborhood of $(\bar x, \bar y)$ in $X \times Y$ and

$F(x, S(x, z)) = z, \quad \forall\, x \in U,\ \forall\, z \in W.$

The mapping $S$ is strictly differentiable at $(\bar x, \bar z)$ with

$S_z(\bar x, \bar z) = (F_y(\bar x, \bar y))^{-1}, \quad S_x(\bar x, \bar z) = -(F_y(\bar x, \bar y))^{-1} F_x(\bar x, \bar y).$ (1.1)

The simplest proof of the theorem is obtained by application of the inverse mapping
theorem to the following map $X \times Y \to X \times Z$ (see e.g. [55]):

$(x, y) \mapsto \begin{pmatrix} x \\ F(x, y) \end{pmatrix}.$
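
The formulas for the derivatives of $S$ can be checked by a formal chain rule computation. The following is only a sketch: it takes for granted that $S$ is differentiable at $(\bar x, \bar z)$ with $S(\bar x, \bar z) = \bar y$, and differentiates the identity $F(x, S(x, z)) = z$ with respect to each variable.

```latex
% Differentiate F(x, S(x,z)) = z at (\bar x, \bar z), where S(\bar x, \bar z) = \bar y.
\begin{align*}
\text{in } z:&\quad F_y(\bar x,\bar y)\, S_z(\bar x,\bar z) = I_Z
  &&\Longrightarrow\quad S_z(\bar x,\bar z) = \bigl(F_y(\bar x,\bar y)\bigr)^{-1},\\
\text{in } x:&\quad F_x(\bar x,\bar y) + F_y(\bar x,\bar y)\, S_x(\bar x,\bar z) = 0
  &&\Longrightarrow\quad S_x(\bar x,\bar z) = -\bigl(F_y(\bar x,\bar y)\bigr)^{-1} F_x(\bar x,\bar y).
\end{align*}
```

The minus sign in the second formula comes from moving the term $F_x(\bar x, \bar y)$ to the right-hand side before inverting $F_y(\bar x, \bar y)$.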

1.4 Sard theorem. Transversality. 

Definition 1.9 (critical and regular value). Let $X$ and $Y$ be Banach spaces, and let $F$
be a mapping into $Y$ defined and continuously differentiable on an open set $U \subset X$. A
vector $y \in Y$ is called a critical value of $F$ if there is an $x \in U$ such that $F(x) = y$ and $x$ is
a singular point of $F$ (that is, $F$ is not regular at $x$). Any point in the range space which is not a critical value is called
a regular value, even if it does not belong to $\mathrm{Im}\, F$. Thus $y$ is a regular value if either
$y \ne F(x)$ for any $x$ of the domain of $F$ or $\mathrm{Im}\, F'(x) = Y$ for every $x$ such that $F(x) = y$.

Theorem 1.10 (Sard [160]). Let $\Omega$ be an open set in $\mathbb{R}^n$ and $F$ a $C^k$-mapping from $\Omega$
into $\mathbb{R}^m$. Then the Lebesgue measure of the set of critical values of $F$ is equal to zero,
provided $k \ge n - m + 1$.

For a proof of a "full" Sard theorem see [1]; a much shorter proof for functions can
be found in [137].

Definition 1.11 (transversality). Let $F : X \to Y$ be a $C^1$-mapping, and let $M \subset Y$ be
a $C^1$-submanifold. Let finally $x$ be in the domain of $F$. We say that $F$ is transversal to
$M$ at $x$ if either $y = F(x) \notin M$ or $y \in M$ and $\mathrm{Im}\, F'(x) + T_yM = Y$. It is said that $F$ is
transversal to $M$: $F \pitchfork M$, if it is transversal to $M$ at every $x$ of the domain of $F$.

We can also speak about transversality of two manifolds $M_1$ and $M_2$ in $X$: $M_1 \pitchfork M_2$ at
$x \in M_1 \cap M_2$ if $T_xM_1 + T_xM_2 = X$. For our future discussions, it is useful to have in mind
that the latter property can be equivalently expressed in dual terms: $N_xM_1 \cap N_xM_2 = \{0\}$,
where $N_xM \subset X^*$ is the normal space to $M$ at $x$, that is the annihilator of $T_xM$.

A connection with regularity is immediate from the definition: if $(L, \varphi)$ is a local
parametrization for $M$ at $y$ and $y = F(x)$, then transversality of $F$ to $M$ at $x$ is equivalent
to regularity at $(x, 0)$ of the mapping $\Phi : X \times L \to Y$ given by $\Phi(u, v) = F(u) - \varphi(v)$.

The connection of transversality and regularity is actually much deeper. Let $P$ be also
a Banach space and let $F : X \times P \to Y$. We can view $F$ as a family of mappings from $X$
into $Y$ parameterized by elements of $P$. Let us denote "individual" mappings $x \mapsto F(x, p)$
by $F(\cdot, p)$. Let further $M \subset Y$ be a submanifold, and let $\pi : X \times P \to P$ be the standard
Cartesian projection $(x, p) \mapsto p$.
Proposition 1.12. Suppose $F$ is transversal to $M$ and $Q = F^{-1}(M)$ is a manifold. Let
finally $\pi|_Q$ stand for the restriction of $\pi$ to $Q$. Then $F(\cdot, p)$ is transversal to $M$, provided
$p$ is a regular value of $\pi|_Q$.

Combining the proposition with the Sard theorem, we get the following (simple version
of the) transversality theorem of Thom.

Theorem 1.13 (see e.g. [76]). Let $X$, $Y$ and $P$ be finite dimensional Banach spaces. Let
$M \subset Y$ be a $C^r$-manifold, and let $F : X \times P \to Y$ be a $C^k$-mapping ($k \le r$). Assume that
$F \pitchfork M$ and $k > \dim X - \mathrm{codim}\, M$. Then $F(\cdot, p) \pitchfork M$ for each $p \in P$ outside of a subset
of $P$ with $\dim P$-Lebesgue measure zero.

2 Metric theory. Definitions and equivalences. 

Here $X$ and $Y$ are metric spaces. We use the same notation for the metrics in both, hoping
this will not lead to any difficulties.

2.1 Local regularity 

We start with the simplest and the most popular case of local regularity near a certain
point of the graph. So let $F : X \rightrightarrows Y$ be given as well as $(\bar x, \bar y) \in \mathrm{Graph}\, F$.

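Definition 2.1 (local regularity properties). We say that $F$ is

• open or covering at a linear rate near $(\bar x, \bar y)$ if there are $r > 0$, $\varepsilon > 0$ such that

$B(y, rt) \cap B(\bar y, \varepsilon) \subset F(B(x, t)), \quad \forall\, (x, y) \in \mathrm{Graph}\, F,\ d(x, \bar x) < \varepsilon,\ t \ge 0.$

The upper bound $\mathrm{sur}F(\bar x|\bar y)$ of such $r$ is the modulus or rate of surjection of $F$ near $(\bar x, \bar y)$.
If no such $r$, $\varepsilon$ exist, we set $\mathrm{sur}F(\bar x|\bar y) = 0$;

• metrically regular near $(\bar x, \bar y) \in \mathrm{Graph}\, F$ if there are $K > 0$, $\varepsilon > 0$ such that

$d(x, F^{-1}(y)) \le K d(y, F(x)), \quad \text{if } d(x, \bar x) < \varepsilon,\ d(y, \bar y) < \varepsilon.$

The lower bound $\mathrm{reg}F(\bar x|\bar y)$ of such $K$ is the modulus or rate of metric regularity of $F$ near
$(\bar x, \bar y)$. If no such $K$, $\varepsilon$ exist, we set $\mathrm{reg}F(\bar x|\bar y) = \infty$;

• pseudo-Lipschitz or has the Aubin property near $(\bar x, \bar y)$ if there are $K > 0$ and $\varepsilon > 0$
such that

$d(y, F(x)) \le K d(x, u), \quad \text{if } d(x, \bar x) < \varepsilon,\ d(y, \bar y) < \varepsilon,\ y \in F(u).$

The lower bound $\mathrm{lip}F(\bar x|\bar y)$ of such $K$ is the Lipschitz modulus or rate of $F$ near $(\bar x, \bar y)$. If no such $K$,
$\varepsilon$ exist, we set $\mathrm{lip}F(\bar x|\bar y) = \infty$.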

Note a difference between the covering property and the conclusions of theorems of
Lyusternik and Graves: the theorems deal only with the given argument $\bar x$ while in the
definition we speak about all $x \in \mathrm{dom}\, F$ close to $\bar x$. This difference, which was once a
subject of heated discussions, is in fact illusory, as under the assumptions of the theorems of
Lyusternik and Graves the covering property in the sense of the just introduced definitions
is automatically satisfied.

The key and truly remarkable fact for the theory is that the three parts of the definition
actually speak about the same phenomenon. Namely, the following holds true
unconditionally for any set-valued mapping between two metric spaces.

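Proposition 2.2 (local equivalence). $F$ is open at a linear rate near $(\bar x, \bar y) \in \mathrm{Graph}\, F$
if and only if it is metrically regular near $(\bar x, \bar y)$ and if and only if $F^{-1}$ has the Aubin
property near $(\bar y, \bar x)$. Moreover, under the convention that $0 \cdot \infty = 1$,

$\mathrm{sur}F(\bar x|\bar y) \cdot \mathrm{reg}F(\bar x|\bar y) = 1; \quad \mathrm{reg}F(\bar x|\bar y) = \mathrm{lip}F^{-1}(\bar y|\bar x).$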
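
The relation between the rates can be seen on a very simple set-valued mapping. The sketch below assumes the hypothetical example $F(x) = [2x, +\infty)$ on the real line, for which $F^{-1}(y) = (-\infty, y/2]$, and estimates the modulus of metric regularity on a grid by taking the largest observed ratio of the two distances in the definition.

```python
def dist_to_image(y, x):
    """d(y, F(x)) for the example F(x) = [2x, +inf)."""
    return max(0.0, 2.0 * x - y)


def dist_to_preimage(x, y):
    """d(x, F^{-1}(y)), where F^{-1}(y) = (-inf, y/2]."""
    return max(0.0, x - y / 2.0)


def worst_ratio(points):
    """Largest observed d(x, F^{-1}(y)) / d(y, F(x)) over pairs with y not in F(x)."""
    worst = 0.0
    for x in points:
        for y in points:
            gap = dist_to_image(y, x)
            if gap > 0.0:
                worst = max(worst, dist_to_preimage(x, y) / gap)
    return worst


grid = [k / 8.0 for k in range(-16, 17)]
K = worst_ratio(grid)
print(K)          # grid estimate of the modulus of metric regularity
print(1.0 / K)    # the matching rate of surjection, since sur F * reg F = 1
```

For this mapping the ratio equals 1/2 at every pair with $y \notin F(x)$, so the grid estimate is exact, and the reciprocal 2 is the slope with which the lower boundary of the graph moves.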

Remark 2.3. In view of the proposition it makes sense to use the word regular to characterize
the three properties. This terminology would also emphasize the ties with the
classical regularity concept. We observe further that while the rates of regularity are connected
with specific distances in $X$ and $Y$, the very fact that $F$ is regular near a certain point
is independent of the choice of specific metrics. Thus, although the definitions explicitly
use metrics, the regularity is a topological property.

The proof of the proposition is fairly simple (we shall get it as a consequence of a more 
general equivalence theorem later in this section). But the way to it was surprisingly long 
(see brief bibliographic comments at the end of the section). 

There are other equivalent formulations of the properties. For instance, the definition
of linear openness/covering can be modified by adding the constraint $0 \le t < \varepsilon$ (see [92]);
a well known modification of the definition of metric regularity includes the condition that
$d(y, F(x)) < \varepsilon$. The only difference is that the $\varepsilon$'s in the original and modified definitions
may be different.

Definition 2.4 (graph regularity [164]). $F$ is said to be graph-regular at (or near) $(\bar x, \bar y) \in
\mathrm{Graph}\, F$ if there are $K > 0$, $\varepsilon > 0$ such that the inequality

$d(x, F^{-1}(y)) \le K d((x, y), \mathrm{Graph}\, F),$ (2.1)

holds, provided $d(x, \bar x) < \varepsilon$, $d(y, \bar y) < \varepsilon$.

Proposition 2.5 (metric regularity vs graph regularity [164]). Let $F : X \rightrightarrows Y$, and
$(\bar x, \bar y) \in \mathrm{Graph}\, F$. Then $F$ is metrically regular at $(\bar x, \bar y)$ if and only if it is graph-regular
at $(\bar x, \bar y)$.

Note that, unlike the equivalence theorem, the last proposition is purely local: the 
straightforward non-local extension of this result (e.g. along the lines of the subsection 
below) is wrong. 

2.2 Non-local regularity. 

As we have already mentioned, most current research focuses on local regularity (although
the first abstract definition of the covering property given in [45] was absolutely
non-local). To a large extent this is because of the close connection of modern variational
analysis studies with optimization theory which is basically interested in local results:
optimality conditions, stability of solutions under small perturbations, etc. Another less
visible reason is that non-local regularity is a more delicate concept: in the non-local case
we cannot freely change the regularity domain that is an integral part of the definition.
Meanwhile non-local regularity is a powerful instrument for proving e.g. various existence
theorems (see e.g. subsection 8.7).

Let $U \subset X$ and $V \subset Y$ (we usually assume $U$ and $V$ open), let $F : X \rightrightarrows Y$, and
let $\gamma(\cdot)$ and $\delta(\cdot)$ be extended-real-valued functions on $X$ and $Y$ assuming positive values
(possibly infinite) respectively on $U$ and $V$.

Definition 2.6 (non-local regularity properties [92]). We say that $F$ is

• $\gamma$-open (or $\gamma$-covering) at a linear rate on $U \times V$ if there is an $r > 0$ such that

$B(F(x), rt) \cap V \subset F(B(x, t)),$

if $x \in U$ and $t < \gamma(x)$. Denote by $\mathrm{sur}_\gamma F(U|V)$ the upper bound of such $r$. If no such $r$
exists, set $\mathrm{sur}_\gamma F(U|V) = 0$. We shall call $\mathrm{sur}_\gamma F(U|V)$ the modulus (or rate) of $\gamma$-openness
of $F$ on $U \times V$;

• $\gamma$-metrically regular on $U \times V$ if there is a $K > 0$ such that

$d(x, F^{-1}(y)) \le K d(y, F(x)),$

provided $x \in U$, $y \in V$ and $K d(y, F(x)) < \gamma(x)$. Denote by $\mathrm{reg}_\gamma F(U|V)$ the lower bound
of such $K$. If no such $K$ exists, set $\mathrm{reg}_\gamma F(U|V) = \infty$. We shall call $\mathrm{reg}_\gamma F(U|V)$ the modulus
(or rate) of $\gamma$-metric regularity of $F$ on $U \times V$;

• $\delta$-pseudo-Lipschitz on $U \times V$ if there is a $K > 0$ such that

$d(y, F(x)) \le K d(x, u)$

if $x \in U$, $y \in V$, $K d(x, u) < \delta(y)$ and $y \in F(u)$. Denote by $\mathrm{lip}_\delta F(U|V)$ the lower bound
of such $K$. If no such $K$ exists, set $\mathrm{lip}_\delta F(U|V) = \infty$. We shall call $\mathrm{lip}_\delta F(U|V)$ the $\delta$-Lipschitz
modulus of $F$ on $U \times V$.

If $U = X$ and $V = Y$, let us agree to write $\mathrm{sur}_\gamma F$, $\mathrm{reg}_\gamma F$, $\mathrm{lip}_\delta F$ instead of $\mathrm{sur}_\gamma F(X|Y)$,
etc. The role of the functions $\gamma$ and $\delta$ is clear from the definitions. They determine how far
we shall reach from any given point in verification of the defined properties. It is therefore
natural to call them regularity horizon functions. Such functions are inessential for local
regularity (see e.g. Exercise 2.8 below). But for fixed $U$ and $V$ the regularity horizon function
is an essential element of the definition. Regularity properties corresponding to different
$\gamma$ may not be equivalent (see Example 2.2 in [97] and also Exercise 2.8 below).

Theorem 2.7 (equivalence theorem). The following three properties are equivalent for
any pair of metric spaces $X$, $Y$, any $F : X \rightrightarrows Y$, any $U \subset X$ and $V \subset Y$ and any
(extended-real-valued) function $\gamma(x)$ which is positive on $U$:

(a) $F$ is $\gamma$-open at a linear rate on $U \times V$;

(b) $F$ is $\gamma$-metrically regular on $U \times V$;

(c) $F^{-1}$ is $\gamma$-pseudo-Lipschitz on $V \times U$.

Moreover (under the convention that $0 \cdot \infty = 1$)

$\mathrm{sur}_\gamma F(U|V) \cdot \mathrm{reg}_\gamma F(U|V) = 1, \quad \mathrm{reg}_\gamma F(U|V) = \mathrm{lip}_\gamma F^{-1}(V|U).$
Proof. The implication (b) $\Rightarrow$ (c) is trivial. Hence $\mathrm{lip}_\gamma F^{-1}(V|U) \le \mathrm{reg}_\gamma F(U|V)$. To prove
that (c) $\Rightarrow$ (a), take a $K > \mathrm{lip}_\gamma F^{-1}$ and an $r < K^{-1}$, let $t < \gamma(x)$, and let $x \in U$, $y \in V$,
$v \in F(x)$ and $y \in B(v, tr)$. Then $d(y, v) < r\gamma(x)$ and by (c) $d(x, F^{-1}(y)) \le K d(y, v) <
r^{-1} d(y, v) \le t$. It follows that there is a $u$ such that $y \in F(u)$ and $d(x, u) \le t$. Hence
$y \in F(B(x, t))$. It follows that $r \le \mathrm{sur}_\gamma F$, or equivalently $1 \le K\, \mathrm{sur}_\gamma F$. But $r$ can be
chosen arbitrarily close to $K^{-1}$ and $K$ can be chosen arbitrarily close to $\mathrm{lip}_\gamma F^{-1}$. So
we conclude that $\mathrm{sur}_\gamma F \cdot \mathrm{lip}_\gamma F^{-1} \ge 1$.

Let finally (a) hold with some $r > 0$, let $x \in U$, $y \in V$, and let $d(y, F(x)) < r\gamma(x)$.
Choose a $v \in F(x)$ such that $d(y, v) < r\gamma(x)$ and set $t = d(y, v)/r$. By (a) there is a
$u \in F^{-1}(y)$ such that $d(x, u) \le t$. Thus $d(x, F^{-1}(y)) \le t = d(y, v)/r$. But $d(y, v)$ can
be chosen arbitrarily close to $d(y, F(x))$ and we get $d(x, F^{-1}(y)) \le r^{-1} d(y, F(x))$, that is
$r \cdot \mathrm{reg}_\gamma F \le 1$. On the other hand $r$ can be chosen arbitrarily close to $\mathrm{sur}_\gamma F$ and we can
conclude that $\mathrm{sur}_\gamma F \cdot \mathrm{reg}_\gamma F \le 1$ so that

$1 \ge \mathrm{sur}_\gamma F(U|V) \cdot \mathrm{reg}_\gamma F(U|V) \ge \mathrm{sur}_\gamma F(U|V) \cdot \mathrm{lip}_\gamma F^{-1}(V|U) \ge 1$

which completes the proof of the theorem. □

The most important example of the horizon function is $m(x) = d(x, X \setminus U)$. The
meaning is that we need not look at points beyond $U$. We shall call $F$ Milyutin regular on
$U \times V$ if it is $m$-regular. (This is actually the type of regularity implicit in the definition
given in [45].) In what follows we shall deal only with Milyutin regularity when speaking
about non-local matters.

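Exercise 2.8. Prove that $F$ is regular near $(\bar x, \bar y) \in \mathrm{Graph}\, F$ if and only if it is Milyutin
regular on $\overset{\circ}{B}(\bar x, \varepsilon) \times \overset{\circ}{B}(\bar y, \varepsilon)$ for all sufficiently small $\varepsilon$.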

We conclude the section with a useful result (a slight modification of the corresponding
result in [88]) showing that, as far as metric regularity is concerned, any set-valued mapping
can be equivalently and in a canonical way replaced by a single-valued mapping continuous
on its domain.

Proposition 2.9 (single-valued reduction). Let $X \times Y$ be endowed with the $\xi$-metric.
Let $F$ be Milyutin regular on $U \times V$ with $\mathrm{sur}_m F(U|V) \ge r > 0$. Consider the mapping
$P_F : \mathrm{Graph}\, F \to Y$ which is the restriction to $\mathrm{Graph}\, F$ of the Cartesian projection
$(x, y) \mapsto y$.

Then $P_F$ is Milyutin regular on $(U \times Y) \times V$ and $\mathrm{sur}_m P_F(U \times Y|V) = \mathrm{sur}_m F(U|V)$
if $X \times Y$ is considered e.g. with the $\xi$-metric.

A few bibliographic comments. To begin with, it is worth mentioning that in the classical
theory no interest in metric estimates can be traced. The covering property close to
the covering part of Milyutin regularity was introduced in [45] and attributed to Milyutin.
An estimate of metric regularity type first appeared in Lyusternik's paper [126] but
for $x$ restricted to the kernel of the derivative. In Ioffe-Tikhomirov [103] metric regularity
was proved under the assumptions of the Graves theorem. Robinson was probably the
first to consider set-valued mappings. In [150] he proved metric regularity of the mapping
$F(x) = f(x) + K$ (even of the restriction of this mapping to a convex closed subset of $X$),
under the assumptions that $f : X \to Y$ is continuously differentiable and $K \subset Y$ is a closed
convex cone, under a certain qualification condition extending Lyusternik's $\mathrm{Im}\, F'(\bar x) = Y$.
The definition of $\gamma$-regularity was given in [92].

Equivalence of covering and metric regularity was explicitly mentioned (without proof)
in the paper of Dmitruk-Milyutin-Osmolovski [45] that marked the beginning of systematic
study of the regularity phenomena, in particular in metric spaces, and Ioffe in [52] stated
a certain equivalence result (Proposition 11.12 - see [87] for its proof) which, as was
much later understood, contains even more precise information about the connection of
the covering and metric regularity properties. And the pseudo-Lipschitz property was
introduced by Aubin.

This was the sequence of events prior to the proof of the equivalence of the three
properties by Borwein-Zhuang [26] and Penot [142]. It has to be mentioned that in both
papers more general "nonlinear" properties were considered. In this connection we also
mention the paper by Frankowska [67] with a short proof of nonlinear openness and some
pseudo-Hölder property.

3 Metric theory. Regularity criteria. 

This section is central. Here we prove necessary and sufficient conditions for regularity.
The key results are Theorems 3.1, 3.2 and 3.3 containing general regularity criteria. The
criteria (especially the first of them) will serve us as a basis for obtaining various qualitative
and quantitative characterizations of regularity in this and subsequent sections. The
criteria are very simple to prove and, at the same time, provide us with an instrument of
analysis which is both powerful and easy to use. We shall see this already in this section
and many times in what follows. In the second subsection we consider infinitesimal criteria
for local regularity based on the concept of slope, which is central in the local theory.

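Given a set-valued mapping $F : X \rightrightarrows Y$, we associate with it the following functions
that will be systematically used in connection with the criteria and their applications:

$$\varphi_y(x,v) = \begin{cases} d(y,v), & \text{if } v \in F(x),\\ +\infty, & \text{otherwise}; \end{cases} \qquad \psi_y(x) = d(y,F(x)); \qquad \overline{\psi}_y(x) = \liminf_{u \to x} \psi_y(u).$$

Note that $\varphi_y$ is Lipschitz continuous on Graph $F$, hence it is lower semicontinuous
whenever Graph $F$ is a closed set.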

3.1 General criteria. 

Given a $\xi > 0$, we define the $\xi$-metric on $X \times Y$ by

$$d_\xi((x,y),(x',y')) = \max\{d(x,x'),\ \xi d(y,y')\}.$$

Theorem 3.1 (criterion for Milyutin regularity). Let $U \subset X$ and $V \subset Y$ be open sets,
and let $F : X \rightrightarrows Y$ be a set-valued mapping whose graph is complete in the product metric.
Let further $r > 0$ and let there be a $\xi > 0$ such that for any $x \in U$, $y \in V$, $v \in F(x)$ with
$0 < d(y,v) < r m(x)$, there is a pair $(u,w) \in$ Graph $F$ different from $(x,v)$ and such that

$$d(y,w) \le d(y,v) - r d_\xi((x,v),(u,w)). \tag{3.1}$$

Then $F$ is Milyutin regular on $U \times V$ with $\operatorname{sur}_m F(U|V) \ge r$.

Conversely, if $F$ is Milyutin regular on $U \times V$, then for any positive $r < \operatorname{sur}_m F(U|V)$,
any $\xi \in (0, r^{-1})$, any $x \in U$, $v \in F(x)$ and $y \in V$ satisfying $0 < d(y,v) < r m(x)$, there is
a pair $(u,w) \in$ Graph $F$ different from $(x,v)$ such that (3.1) holds.

The theorem offers a very simple geometric interpretation of the regularity phenomenon:
it means that $F$ is regular if for any $(x,v) \in$ Graph $F$ and any $y \ne v$ there is a point in
the graph whose $Y$-component is closer to $y$ (than $v$) and the distance from the new point
to the original point $(x,v)$ is proportional to the gain in the distance to $y$.
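As a hedged illustration of inequality (3.1), the following Python sketch checks it for the toy single-valued map $F(x) = \{ax\}$ on the real line. The helper names (`d_xi`, `criterion_3_1_holds`, `candidate_for_linear_map`) and the choice of the candidate pair $(u,w)$ are assumptions made for this example only; they are not taken from the text.

```python
def d_xi(p, q, xi):
    """xi-metric on R x R: max{|x - u|, xi * |v - w|} for p = (x, v), q = (u, w)."""
    (x, v), (u, w) = p, q
    return max(abs(x - u), xi * abs(v - w))


def criterion_3_1_holds(x, v, y, u, w, r, xi):
    """True if (u, w) differs from (x, v) and d(y, w) <= d(y, v) - r * d_xi((x, v), (u, w))."""
    if (u, w) == (x, v):
        return False
    return abs(y - w) <= abs(y - v) - r * d_xi((x, v), (u, w), xi)


def candidate_for_linear_map(a, y):
    """Hypothetical choice of (u, w) for F(x) = {a * x}, a != 0: solve a * u = y exactly."""
    return y / a, y


a, x, y = 2.0, 1.0, 3.0
v = a * x                      # the only point of F(x)
u, w = candidate_for_linear_map(a, y)

# r = 1.5 is below |a| = 2 and xi = 0.5 < 1 / r: the candidate gives enough descent.
ok_small_r = criterion_3_1_holds(x, v, y, u, w, r=1.5, xi=0.5)
# r = 2.5 exceeds |a|: this candidate no longer satisfies the inequality.
ok_large_r = criterion_3_1_holds(x, v, y, u, w, r=2.5, xi=0.3)
```

With $a = 2$ the pair $(u,w) = (y/a, y)$ satisfies (3.1) for $r = 1.5$ but not for $r = 2.5$, in agreement with the fact that the surjection modulus of $x \mapsto 2x$ equals 2.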

Proof. We have to verify that, given $(\bar x,\bar v) \in$ Graph $F$ with $\bar x \in U$, $y \in V$ and
$0 < d(y,\bar v) < rt$, $t < m(\bar x)$, there is a $u \in B(\bar x,t)$ such that $y \in F(u)$. We have $\varphi_y(\bar x,\bar v) < rt$.
By Ekeland's variational principle (see e.g. [?]) there is a pair $(\hat x,\hat v) \in$ Graph $F$ such
that $d_\xi((\hat x,\hat v),(\bar x,\bar v)) \le t$ and

$$\varphi_y(x,v) + r d_\xi((x,v),(\hat x,\hat v)) > \varphi_y(\hat x,\hat v) \tag{3.2}$$

if $(x,v) \ne (\hat x,\hat v)$. We claim that $\varphi_y(\hat x,\hat v) = 0$, that is $y = \hat v \in F(\hat x)$. Indeed, $\hat x \in U$, so
by the assumption, if $y \ne \hat v$, there is a pair $(u,w) \ne (\hat x,\hat v)$ such that (3.1) holds with
$(\hat x,\hat v)$ as $(x,v)$, which however contradicts (3.2). This proves the first statement.

Assume now that $F$ is Milyutin regular on $U \times V$ with the surjection modulus not
smaller than $r$. Take a positive $\xi < r^{-1}$ and $x \in U$, $y \in V$, $v \in F(x)$ with $d(y,v) < r m(x)$.
Take a small $\varepsilon \in (0,r)$ and choose a $t \in (0, m(x))$ such that $(r - \varepsilon)t < d(y,v) < rt$. By
regularity there is a $u$ such that $d(u,x) < t$ and $y \in F(u)$. Note that $t > \xi d(y,v)$ by the
choice of $\xi$. So setting $w = y$ we have $t > \xi d(v,w)$ and

$$d(y,w) = 0 < d(y,v) - (r - \varepsilon)t < d(y,v) - (r - \varepsilon) d_\xi((x,v),(u,w)).$$

Since $\varepsilon$ can be chosen arbitrarily small, the result follows. □

Theorem 3.2 (second criterion for Milyutin regularity). Let $X$ be a complete metric
space, $U \subset X$ and $V \subset Y$ open sets and $F : X \rightrightarrows Y$ a set-valued mapping with closed
graph. Then $F$ is Milyutin regular on $U \times V$ with $\operatorname{sur}_m F(U|V) \ge r$ if and only if for any
$x \in U$ and any $y \in V$ with $0 < \overline{\psi}_y(x) < r m(x)$ there is a $u \ne x$ such that

$$\overline{\psi}_y(u) \le \overline{\psi}_y(x) - r d(x,u). \tag{3.3}$$

Proof. The proof of sufficiency is similar to the proof of the first part of the previous 
theorem. 

To prove that (3.3) is necessary for Milyutin regularity take $x \in U$, $y \in V$ such that
$0 < d(y,F(x)) < r m(x)$. Take $\rho < r$ such that still $d(y,F(x)) < \rho m(x)$, and let $\rho < \rho' < r$.
Let $x_n \to x$ be such that $d(y,F(x_n)) \to \overline{\psi}_y(x)$. We may assume that $d(y,F(x_n)) < r m(x)$
for all $n$. Choose positive $\delta_n \to 0$ such that $d(y,F(x_n)) < (1 + \delta_n)\overline{\psi}_y(x)$, and let $t_n$
be defined by $\rho' t_n = (1 + \delta_n)\overline{\psi}_y(x)$. Then $y \in \mathring{B}(F(x_n), \rho' t_n)$, $t_n < m(x_n)$ (at least for
large $n$) and due to the regularity assumption on $F$ for any $n$ we can find a $u_n$ such that
$d(u_n,x_n) < t_n$ and $y \in F(u_n)$. Note that $u_n$ are bounded away from $x$ for otherwise (as
Graph $F$ is closed) we would inevitably conclude that $y \in F(x)$, which cannot happen as
$\overline{\psi}_y(x) > 0$. This means that $\lambda_n = d(u_n,x_n)/d(u_n,x)$ converge to one. Thus

$$\overline{\psi}_y(u_n) = 0 = \overline{\psi}_y(x) - \overline{\psi}_y(x) = \overline{\psi}_y(x) - \frac{\rho' t_n}{1 + \delta_n} < \overline{\psi}_y(x) - \frac{\rho'}{1 + \delta_n} d(u_n,x_n) = \overline{\psi}_y(x) - \frac{\lambda_n \rho'}{1 + \delta_n} d(u_n,x) < \overline{\psi}_y(x) - \rho d(u_n,x),$$

the last inequality being eventually true as $\lambda_n \rho' > \rho(1 + \delta_n)$ for large $n$. □


The theorem is especially convenient when $\psi_y$ is lower semicontinuous for every $y \in V$.
Otherwise, the need for preliminary calculation of $\overline{\psi}_y$, the lower closure of $\psi_y$, may cause
difficulties. It is possible however to modify the condition of the theorem and get a
statement that requires verification of a (3.3)-like inequality for $\psi$ rather than $\overline{\psi}$, although
at the expense of some additional uniformity assumption.

Theorem 3.3 (modified second criterion for Milyutin regularity). Let $X$, $Y$, $F$, $U$ and
$V$ be as in Theorem 3.2. A necessary and sufficient condition for $F$ to be Milyutin regular
on $U \times V$ with $\operatorname{sur}_m F(U|V) \ge r$ is that there is a $\lambda \in (0,1)$ and for any $x \in U$ and $y \in V$
with $0 < \psi_y(x) < r m(x)$ there is a $u \ne x$ such that

$$\psi_y(u) \le \psi_y(x) - r d(x,u), \qquad \psi_y(u) \le \lambda \psi_y(x). \tag{3.4}$$


Proof. The key for understanding the theorem is the following implication

$$\overline{\psi}_y(x) = 0 \ \Longrightarrow\ y \in F(x), \tag{3.5}$$

of course valid, under the condition of the theorem, for $x \in U$, $y \in V$. Indeed, $\overline{\psi}_y(x) = 0$
means that there is a sequence $(x_n)$ converging to $x$ such that $\psi_y(x_n) \to 0$. This in turn
implies the existence of $v_n \in F(x_n)$ converging to $y$. As the graph of $F$ is closed, it follows
that $(x,y) \in$ Graph $F$ as claimed.

Now we can verify that under the assumptions of the theorem, the condition of Theorem
3.2 holds. So let $x \in U$, $y \in V$ and $0 < a = \overline{\psi}_y(x)$. Take $x_n \to x$ such that $\psi_y(x_n) =
a_n \to a$, and for each $n$ a $u_n$ such that $\psi_y(u_n) \le \lambda a_n$ and $\psi_y(u_n) \le \psi_y(x_n) - r d(x_n,u_n)$.
An easy calculation shows that

$$\overline{\psi}_y(u_n) \le \overline{\psi}_y(x) - r d(x,u_n) + \varepsilon_n,$$

where $\varepsilon_n \to 0$. The distances $d(x,u_n)$ are bounded away from zero by a positive constant
(otherwise, $\overline{\psi}_y$ being lower semicontinuous, a subsequence $u_n \to x$ would give $a \le \lambda a$), so we have
$\varepsilon_n = \delta_n d(x,u_n)$, where $\delta_n \to 0$. Combining this with the above inequality, we conclude
that for any $r' < r$ we have $u_n \ne x$ and the inequality

$$\overline{\psi}_y(u_n) \le \overline{\psi}_y(x) - r' d(x,u_n)$$

holds for sufficiently large $n$. This allows us to apply Theorem 3.2 and conclude (by virtue of
(3.5)) that there is a $u \in B(x,(r')^{-1})$ such that $y \in F(u)$, that is $\operatorname{sur}_m F(U|V) \ge r'$. □

Note that the proof of necessity in the two last theorems does not differ from the proof
of Theorem 3.1. Corresponding criteria for local regularity are immediate.

Theorem 3.4 (criterion for local regularity). Let $F : X \rightrightarrows Y$ be a set-valued mapping
with closed graph, and let $(\bar x,\bar y) \in$ Graph $F$. Then $F$ is regular near $(\bar x,\bar y)$ if and only
if there are $\varepsilon > 0$, $\xi > 0$ and $r > 0$ such that for any $x$, $v$ and $y$ satisfying $d(x,\bar x) <
\varepsilon$, $d(y,\bar y) < \varepsilon$, $v \in F(x)$ and $0 < d(y,v) < \varepsilon$ either of the following two properties is valid:

(a) Graph $F$ is locally complete and there is a pair $(u,w) \in$ Graph $F$, $(u,w) \ne (x,v)$
such that (3.1) holds.

(b) $X$ is a complete metric space, the graph of $F$ is closed and either (3.3) or (3.4)
holds true.

Moreover, in either case $\operatorname{sur} F(\bar x|\bar y) \ge r$.

Theorem 3.1 is a particular case of the criterion for $\gamma$-regularity proved in [92]. Theorem
3.4 is a modification of the result established in [88]. Theorem 3.2 is new but it was largely
stimulated by a recent result of Ngai-Tron-Thera [136] (see Theorem 3.12 later in this
section) and by a much earlier observation by Cominetti [?] that $\overline{\psi}_y(x) = 0$ implies that
$y \in F(x)$. Surprisingly, it has been recently discovered that sufficiency in the statement of
part (a) of the local criterion (Theorem 3.4) is present as a remark in a much earlier
paper by Fabian and Preiss [63].

The completeness assumption in the first theorem differs from the corresponding
assumption of the other two theorems. So the natural question is if and how they are
connected. It is an easy matter to see, in view of Proposition 2.9, that Theorem 3.1
follows from Theorem 3.2. On the other hand, Theorem 3.1 is easier to use as it does
not need a priori calculation of any limit or verification of the existence of $\lambda$ as in the
third theorem. However, if the functions $d(y,F(\cdot))$ are lower semicontinuous, the second
criterion may be more convenient. It should also be observed that the theorems can be
equivalent in some cases (as follows from Proposition 1.5 in [88]).

3.2 An application: density theorem. 

Here is the first example demonstrating how handy and powerful the criteria are. 

Theorem 3.5 (density theorem [35, 92]). Let $U \subset X$ and $V \subset Y$ be open sets, let
$F : X \rightrightarrows Y$ be a set-valued mapping with complete graph. We assume that whenever
$x \in U$, $v \in F(x)$ and $t < m(x)$, the set $F(B(x,t))$ is an $\ell t$-net in $B(v,rt) \cap V$, where
$0 \le \ell < r$. Then $F$ is Milyutin regular on $U \times V$ and $\operatorname{sur}_m F(U|V) \ge r - \ell$. In particular, if
$F(B(x,t))$ is dense in $B(F(x),rt) \cap V$ for $x \in U$ and $t < m(x)$, then $\operatorname{sur}_m F(U|V) \ge r$.

Proof. Take $x \in U$ and suppose $y \in V$ is such that $d(y,F(x)) < r m(x)$. Take a $v \in F(x)$
such that $d(y,v) < r m(x)$ and set $t = d(y,v)/r$. Then $t < m(x)$ and by the assumption
we can choose $(u,w) \in$ Graph $F$ such that $d(x,u) \le t$ and $d(y,w) \le \ell t = (\ell/r) d(y,v)$.
Then

$$d(v,w) \le d(y,v) + d(y,w) \le \Big(1 + \frac{\ell}{r}\Big) d(y,v) < 2 d(y,v).$$

Take a $\xi > 0$ such that $\xi r < 1/2$. Then $\xi d(v,w) < 2\xi r t < t$ and therefore

$$d(y,w) \le \ell t = rt - (r - \ell)t = d(y,v) - (r - \ell)t \le d(y,v) - (r - \ell) d_\xi((x,v),(u,w)).$$

Apply Theorem 3.1. □


Exercise 3.6. Prove the theorem under the assumptions of Theorem 3.2 rather than
Theorem 3.1.

Exercise 3.7. Prove the Banach-Schauder open mapping theorem using the density theorem
(and the Baire category theorem).

The specification of Theorem 3.5 for local regularity at $(\bar x,\bar y)$ is

Corollary 3.8 (density theorem - local version). Suppose there are $r > 0$ and $\varepsilon > 0$ such
that $F(B(x,t))$ is an $\ell t$-net in $B(v,rt)$ whenever $d(x,\bar x) < \varepsilon$, $d(v,\bar y) < \varepsilon$, $v \in F(x)$ and
$t < \varepsilon$. Then $\operatorname{sur} F(\bar x|\bar y) \ge r - \ell$. Thus if $B(v,rt) \subset \operatorname{cl} F(B(x,t))$ for all $x$, $v$ and $t$ satisfying
the conditions specified above, then $B(v,rt) \subset F(B(x,t))$ for the same set of the variables.

The density phenomenon was extensively discussed, especially at the early stage of
developments. Results in the spirit of Corollary 3.8 were first considered by Ptak [146],
Tziskaridze [166] and Dolecki [46, 47] in the mid-1970s. The very idea (and to a large extent the
techniques used) can be traced back to Banach's proof of the closed graph/open mapping
theorem. Some of the subsequent studies (e.g. [26, 169]) were primarily concentrated
on results of this type. We refer to [?] for detailed discussions and many references.
Dmitruk-Milyutin-Osmolovskii in [55] made a substantial step forward when they replaced
(in the global context) the density requirement by the assumption that $F(B(x,t))$ is an
$\ell t$-net in $B(F(x),rt)$. This opened the way to proving the Milyutin perturbation theorem (see
the next section). A similar advance in the framework of the infinitesimal approach (for
mappings between Banach spaces) was made by Aubin [5].


3.3 Infinitesimal criteria. 


The main tool of the infinitesimal regularity theory in metric spaces is provided by the
concept of (strong) slope - which is just the maximal speed of descent of the function
from a given point - introduced in 1980 by DeGiorgi-Marino-Tosques [?] and since then
widely used in various chapters of metric analysis.


Definition 3.9 (slope). Let $f$ be an extended-real-valued function on $X$ which is finite
at $x$. The quantity

$$|\nabla f|(x) = \limsup_{u \to x,\ u \ne x} \frac{(f(x) - f(u))^+}{d(x,u)}$$

is called the (strong) slope of $f$ at $x$. We also agree to set $|\nabla f|(x) = \infty$ if $f(x) = \infty$. The
function is called calm at $x$ if $|\nabla f|(x) < \infty$.
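To make the definition of the strong slope concrete, here is a rough numerical sketch for functions on the real line. The function name `strong_slope`, the sampling scheme and the radii are assumptions of this illustration; sampling only approximates the limsup and proves nothing.

```python
def strong_slope(f, x, radii=(1e-2, 1e-3, 1e-4, 1e-5), samples=200):
    """Crude estimate of the strong slope |grad f|(x) for f defined on the real line.

    For each radius h, sample points u with 0 < |u - x| <= h and record the largest
    quotient (f(x) - f(u))^+ / |x - u|. The value obtained for the last radius in
    `radii` (assumed to be the smallest) is returned as the estimate of the limsup.
    """
    estimate = 0.0
    for h in radii:
        best = 0.0
        for k in range(1, samples + 1):
            step = h * k / samples
            for u in (x - step, x + step):
                descent = max(f(x) - f(u), 0.0)   # positive part of the decrease
                best = max(best, descent / abs(x - u))
        estimate = best
    return estimate


slope_abs_at_1 = strong_slope(abs, 1.0)                 # |x| descends at unit speed
slope_abs_at_0 = strong_slope(abs, 0.0)                 # no descent from a minimum
slope_sq_at_1 = strong_slope(lambda s: s * s, 1.0)      # close to |f'(1)| = 2
```

For a differentiable function on the line the slope equals the absolute value of the derivative, while at a local minimum it is zero; the three sample values are consistent with this.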


We shall consider only local regularity in this subsection (although it is possible to
give slope-based characterizations of Milyutin regularity as well). It is easy to observe
that $|\nabla f|(x) > r$ means that arbitrarily close to $x$ there are $u \ne x$ such that $f(x) >
f(u) + r d(x,u)$. This allows us to reformulate the sufficient part of the regularity criteria of
Theorem 3.4 in infinitesimal terms. To this end set as before

$$\varphi_y(x,v) = d(y,v) + i_{\operatorname{Graph} F}(x,v), \qquad \psi_y(x) = d(y,F(x)), \qquad \overline{\psi}_y(x) = \liminf_{u \to x} \psi_y(u)$$

(here $i_Q$ is the indicator function of a set $Q$, equal to $0$ on $Q$ and $+\infty$ outside of it),
and let $\nabla_\xi$ stand for the slope of functions on $X \times Y$ with respect to the $d_\xi$-metric:
$d_\xi((x,v),(x',v')) = \max\{d(x,x'),\ \xi d(v,v')\}$.

Things are more complicated with the necessity part: to prove it, an additional
assumption on the target space is needed. Namely, let us say that a metric space $X$ is locally
coherent if for any $x$

$$\lim_{u,w \to x,\ u \ne w} |\nabla d(u,\cdot)|(w) = 1.$$

It can be shown that a convex set and a smooth manifold in a Banach space are locally
coherent in the induced metric [89] and that any length metric space (space whose metric
is defined by minimal lengths of curves connecting points) is locally coherent (as follows
from [?]).

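Theorem 3.10 (local regularity criterion 1 [89]). Let $X$ and $Y$ be metric spaces, let
$F : X \rightrightarrows Y$ be a set-valued mapping, and let $(\bar x,\bar y) \in$ Graph $F$. We assume that Graph $F$
is locally complete at $(\bar x,\bar y)$. Suppose further that there are $\varepsilon > 0$ and $r > 0$ such that for
some $\xi > 0$

$$|\nabla_\xi \varphi_y|(x,v) > r \tag{3.6}$$

if

$$v \in F(x), \quad d(x,\bar x) < \varepsilon, \quad d(y,\bar y) < \varepsilon, \quad d(v,\bar y) < \varepsilon, \quad v \ne y. \tag{3.7}$$

Then $F$ is regular near $(\bar x,\bar y)$ with $\operatorname{sur} F(\bar x|\bar y) \ge r$.

Conversely, let $Y$ be locally coherent at $\bar y$. Assume that $\operatorname{sur} F(\bar x|\bar y) > r > 0$. Take a
$\xi < r^{-1}$. Then for any $\delta > 0$ there is an $\varepsilon > 0$ such that $|\nabla_\xi \varphi_y|(x,v) \ge (1 - \delta)r$ whenever
$(x,y,v)$ satisfy (3.7). Thus, in this case

$$\operatorname{sur} F(\bar x|\bar y) = \liminf_{\substack{(x,v) \to (\bar x,\bar y),\ (x,v) \in \operatorname{Graph} F \\ y \to \bar y,\ y \ne v}} |\nabla_\xi \varphi_y|(x,v). \tag{3.8}$$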

For mappings into metrically convex spaces (for any two points there is a shortest path
connecting the points) the final statement of Theorem 3.10 can be slightly improved.

Corollary 3.11. Suppose under the conditions of Theorem 3.10 that $Y$ is metrically
convex. Then for any neighborhood $V$ of $\bar y$

$$\operatorname{sur} F(\bar x|\bar y) = \liminf_{(x,v) \to (\bar x,\bar y),\ (x,v) \in \operatorname{Graph} F} \ \ldots \tag{3.9}$$


Theorem 3.12 (local regularity criterion 2). Suppose that $X$ is complete and the graph
of $F$ is closed. Assume further that there are neighborhoods $U \subset X$ of $\bar x$ and $V \subset Y$ of $\bar y$,
$r > 0$ and $\varepsilon > 0$ such that $|\nabla \overline{\psi}_y|(x) > r$ for all $(x,y) \in U \times V$ such that $\varepsilon > \overline{\psi}_y(x) > 0$.
Then $\operatorname{sur} F(\bar x|\bar y) \ge r$.

Conversely, if in addition $Y$ is a length space and $\operatorname{sur} F(\bar x|\bar y) > r > 0$, then there
is a neighborhood of $(\bar x,\bar y)$ and an $\varepsilon > 0$ such that $|\nabla \overline{\psi}_y|(x) \ge r$ for all $(x,y)$ of the
neighborhood such that $y \notin F(x)$ and $0 < \overline{\psi}_y(x) < \varepsilon r$. Thus in this case

$$\operatorname{sur} F(\bar x|\bar y) = \liminf_{\substack{(x,y) \to (\bar x,\bar y) \\ 0 \ne d(y,F(x)) \to 0}} |\nabla \overline{\psi}_y|(x).$$

In particular, if $\psi_y = d(y,F(\cdot))$ is lower semicontinuous at every $x$ of a neighborhood
of $\bar x$ and for every $y \notin F(x)$ close to $\bar y$, then

$$\operatorname{sur} F(\bar x|\bar y) = \liminf_{\substack{(x,y) \to (\bar x,\bar y) \\ 0 \ne d(y,F(x)) \to 0}} |\nabla \psi_y|(x).$$

The starting point for developing slope-based regularity theory was the paper by
Aze-Corvellec-Lucchetti [15] (its first version was circulated in 1998), who obtained a global
error bound in terms of "variational pairs" that include slope on a metric space as a
particular case. Theorem 3.10, and specifically the fact that the slope estimate is precise,
was proved in [88] under a somewhat stronger condition (equivalent to $Y$ being a length
space). We refer to [?] for a systematic exposition of the slope-based approach to local
regularity. Theorem 3.12 is a slightly modified version of the mentioned result of
Ngai-Tron-Thera [136] (proved originally for $Y$ being a Banach space).

To explain how the additional assumption on $Y$ is used to get necessity, e.g. in Theorem
3.10, let us consider, following the original argument in [88], $(x,y,v)$ sufficiently close to
$\bar x$ and $\bar y$ respectively and such that $y \ne v \in F(x)$. For any $n$ take $\delta_n = o(n^{-1})$ and a $v_n$
such that $d(v_n,v) \le (n^{-1} + \delta_n) d(y,v)$ and $d(v_n,y) \le (1 - n^{-1} + \delta_n) d(y,v)$. If $Y$ is a length
space, such $v_n$ can be found. As $F$ covers near $(\bar x,\bar y)$ with modulus greater than $r$, there
is a $u_n$ such that $v_n \in F(u_n)$ and $d(u_n,x) \le r^{-1} d(v_n,v) \to 0$ when $n \to \infty$. We have
$|d(y,v) - (d(y,v_n) + d(v,v_n))| = o(d(v_n,v))$. Therefore (as $r\xi < 1$)

$$\lim_{n \to \infty} \frac{d(y,v) - d(y,v_n)}{\max\{d(u_n,x),\ \xi d(v_n,v)\}} \ \ge\ \lim_{n \to \infty} \frac{d(y,v) - d(y,v_n)}{r^{-1} d(v_n,v)} = r.$$

A similar argument, modified as the definition of $\overline{\psi}_y$ includes a limit operation, can be used
also for the proof of necessity in Theorem 3.12.

It should be observed that the class of locally coherent spaces is strictly bigger than 
the class of length spaces. For instance a smooth manifold in a Banach space with the 
induced metric is a locally coherent space but not a length space (unless it is a linear 
manifold). 

3.4 Related concepts: metric subregularity, calmness, controllability, 
linear recession 

In the definitions of the local versions of the three main regularity properties we scan
entire neighborhoods of the reference point of the graph of the mapping. Fixing one or
both components of the point leads to new weaker concepts that differ from regularity in
many respects. Subregularity and calmness have attracted much attention in recent years. We refer to
[?] for a detailed study of the concepts, mainly for mappings between finite dimensional
spaces, and begin with parallel concepts relating to linear openness which are rather new
in the context of variational analysis. We skip the (really elementary) proofs of almost all
results in this subsection.

Definition 3.13 (controllability). A set-valued mapping $F : X \rightrightarrows Y$ is said to be (locally)
controllable at $(\bar x,\bar y)$ if there are $\varepsilon > 0$, $r > 0$ such that

$$B(\bar y,rt) \subset F(B(\bar x,t)), \quad \text{if } 0 \le t < \varepsilon. \tag{3.10}$$

The upper bound of such $r$ is the rate or modulus of controllability of $F$ at $(\bar x,\bar y)$. We shall
denote it $\operatorname{contr} F(\bar x|\bar y)$ and $\operatorname{contr} F(\bar x)$ if $F$ is single-valued.

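Proposition 3.14 (Regularity vs. controllability). Let $X$ and $Y$ be metric spaces, let
$F : X \rightrightarrows Y$ have locally complete graph, and let $(\bar x,\bar y) \in$ Graph $F$. Then

$$\operatorname{sur} F(\bar x|\bar y) = \lim_{\varepsilon \to 0} \inf\{\operatorname{contr} F(x|y) : (x,y) \in \operatorname{Graph} F,\ \max\{d(x,\bar x), d(y,\bar y)\} < \varepsilon\}. \tag{3.11}$$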

Definition 3.15 (linear recession). Let us say that $F$ recedes from $\bar y$ at $(\bar x,\bar y)$ at a linear
rate if there are $\varepsilon > 0$ and $K \ge 0$ such that

$$d(\bar y,F(x)) \le K d(x,\bar x), \quad \text{if } d(x,\bar x) < \varepsilon. \tag{3.12}$$

We shall call the lower bound of such $K$ the speed of recession of $F$ from $\bar y$ at $(\bar x,\bar y)$ and
denote it $\operatorname{ress} F(\bar x|\bar y)$.

The other possible way to "pointify" the Aubin property is to fix $\bar x$ and allow $(x,y)$ to
change within Graph $F$. Then, instead of (3.12) we get the inequality

$$d(y,F(\bar x)) \le K d(x,\bar x). \tag{3.13}$$

Definition 3.16 (calmness). It is said that $F : X \rightrightarrows Y$ is calm at $(\bar x,\bar y)$ if there are $\varepsilon > 0$,
$K \ge 0$ such that (3.13) holds if $d(x,\bar x) < \varepsilon$, $d(y,\bar y) < \varepsilon$ and $y \in F(x)$. The lower bound
of all such $K$ will be called the modulus of calmness of $F$ at $(\bar x,\bar y)$. We shall denote it by
$\operatorname{calm} F(\bar x|\bar y)$ ($\operatorname{calm} F(\bar x)$ if $F$ is single-valued).

Again we can easily see that uniform calmness, that is calmness at every $(x,y)$ of the
intersection of Graph $F$ with a neighborhood of $(\bar x,\bar y)$ with the same $\varepsilon$ and $K$ for all such
$(x,y)$, is equivalent to the Aubin property of $F$ near $(\bar x,\bar y)$.

Definition 3.17 (subregularity). Let $F : X \rightrightarrows Y$ and $\bar y \in F(\bar x)$. It is said that $F$ is
(metrically) subregular at $(\bar x,\bar y)$ if there is a $K > 0$ such that

$$d(x,F^{-1}(\bar y)) \le K d(\bar y,F(x)) \tag{3.14}$$

for all $x$ of a neighborhood of $\bar x$. The lower bound of such $K$ is called the rate or modulus
of subregularity of $F$ at $(\bar x,\bar y)$. It will be denoted $\operatorname{subreg} F(\bar x|\bar y)$.

We say that $F$ is strongly subregular at $(\bar x,\bar y)$ if it is subregular at the point and
$\bar y \notin F(x)$ for $x \ne \bar x$ of a neighborhood of $\bar x$.
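The subregularity inequality can be probed numerically by looking at the quotient $d(x,F^{-1}(\bar y))/d(\bar y,F(x))$ near $\bar x$. The following Python sketch does this for two single-valued maps on the real line with $\bar x = 0$, $\bar y = 0$; the helper name `subregularity_ratio` and the two test maps are assumptions chosen for illustration.

```python
def subregularity_ratio(dist_to_solutions, residual, x):
    """Quotient d(x, F^{-1}(ybar)) / d(ybar, F(x)); subregularity needs it bounded near xbar."""
    return dist_to_solutions(x) / residual(x)


points = [10.0 ** (-k) for k in range(1, 7)]   # points approaching xbar = 0

# f(x) = |x|: the solution set of f(x) = 0 is {0}, and the residual is |x|.
ratios_abs = [subregularity_ratio(abs, abs, x) for x in points]

# f(x) = x^2: same solution set {0}, but the residual x^2 decays faster than |x|.
ratios_sq = [subregularity_ratio(abs, lambda s: s * s, x) for x in points]
```

For $f(x) = |x|$ the quotient is identically 1, so the inequality holds with $K = 1$. For $f(x) = x^2$ the quotient equals $1/|x|$ and is unbounded near zero, so no $K$ works and $f$ is not subregular at $(0,0)$.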

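Proposition 3.18. The equalities

$$\operatorname{subreg} F(\bar x|\bar y) = \operatorname{calm} F^{-1}(\bar y|\bar x), \qquad \operatorname{contr} F(\bar x|\bar y) \cdot \operatorname{ress} F^{-1}(\bar y|\bar x) = 1$$

always hold. If moreover $F$ is strongly subregular at $(\bar x,\bar y)$, then

$$\operatorname{contr} F(\bar x|\bar y) \cdot \operatorname{subreg} F(\bar x|\bar y) \ge 1.$$

Theorem 3.19 (slope criterion for calmness). Let $X$ and $Y$ be arbitrary metric spaces,
let $F : X \rightrightarrows Y$ be a set-valued mapping with closed graph and let $(\bar x,\bar y) \in$ Graph $F$. Then

$$\operatorname{calm} F(\bar x|\bar y) \ge \limsup_{y \to \bar y} |\nabla \psi_y|(\bar x),$$

where, as earlier, $\psi_y(x) = d(y,F(x))$.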

Proof. Let $K > \operatorname{calm} F(\bar x|\bar y)$. Then there is an $\varepsilon > 0$ such that (3.13) holds, provided
$d(x,\bar x) < \varepsilon$, $d(y,\bar y) < \varepsilon$ and $y \in F(x)$. To prove the theorem, it is sufficient to show that
$|\nabla \psi_y|(\bar x) \le K$ for all $y$ sufficiently close to $\bar y$. To this end, it is sufficient to verify that
there is a $\delta > 0$ such that the inequality

$$d(y,F(\bar x)) - d(y,F(x)) \le K d(x,\bar x)$$

holds for all $x$, $y$ satisfying $d(x,\bar x) < \delta$, $d(y,\bar y) < \delta$.

If $y \in F(x)$, then this inequality reduces to (3.13). Take a positive $\delta < \varepsilon/2$, and let $x$ and $y$
be such that $d(x,\bar x) < \delta$, $d(y,\bar y) < \delta$. If $d(y,F(x)) \ge \delta$, then the inequality obviously holds,
because $d(y,F(\bar x)) \le d(y,\bar y) < \delta$. If $d(y,F(x)) < \delta$, we can choose a $v \in F(x)$ such that
$d(y,v) < \delta$. Then $d(v,\bar y) < \varepsilon$ and therefore $d(v,F(\bar x)) \le K d(x,\bar x)$. Thus

$$d(y,F(\bar x)) - d(y,F(x)) \le d(y,v) + d(v,F(\bar x)) - d(y,F(x)) \le K d(x,\bar x) + d(y,v) - d(y,F(x)),$$

and the required inequality follows as $d(y,v)$ can be arbitrarily close to $d(y,F(x))$. □

Theorem 3.20 (slope criterion for subregularity). Assume that $X$ is a complete metric
space. Let $F : X \rightrightarrows Y$ be a closed set-valued mapping and $(\bar x,\bar y) \in$ Graph $F$. Assume
that the function $\psi_{\bar y}(x) = d(\bar y,F(x))$ is lower semicontinuous and there are $\varepsilon > 0$ and $r > 0$
such that

$$|\nabla \psi_{\bar y}|(x) = |\nabla d(\bar y,F(\cdot))|(x) > r,$$

if $d(x,\bar x) < \varepsilon$ and $0 < d(\bar y,F(x)) < \varepsilon$. Then $F$ is subregular at $(\bar x,\bar y)$ with modulus of
subregularity (and hence the modulus of calmness of $F^{-1}$ at $(\bar y,\bar x)$) not greater than $r^{-1}$.

4 Metric theory. Perturbations and stability. 

In this section we concentrate on two fundamental questions: 

(a) what happens with regularity (and subregularity) properties of F if the mapping 
is slightly perturbed? 

(b) how does the set of solutions of the inclusion $y \in F(x,p)$ (where $F$ depends on a
parameter $p$) depend on $(y,p)$?

The answer to the second question leads us to fairly general implicit function theorems.
The key point in both cases is that we have to require a certain amount of Lipschitzness
of perturbations to get desirable results.

4.1 Stability under Lipschitz perturbation

Theorem 4.1 (stability under Lipschitz perturbation). Let $X$, $Y$ be metric spaces, let
$U \subset X$ and $V \subset Y$ be open sets. Consider a set-valued mapping $\Psi : X \times X \rightrightarrows Y$ with
closed graph, assuming that either $X$ or the graph of $\Psi$ is complete. Let $F(x) = \Psi(x,x)$.
Suppose that

(a) for any $u \in U$ the mapping $\Psi(\cdot,u)$ is Milyutin regular on $(U,V)$ with modulus of
surjection greater than $r$, that is for any $x \in U$, any $v \in \Psi(x,u)$ and any $y \in \mathring{B}(v,rt) \cap V$
with $t < d(x,X \setminus U)$ there is an $x'$ such that $d(x,x') \le r^{-1} d(y,v)$ and $y \in \Psi(x',u)$;

(b) for any $x \in U$ the mapping $\Psi(x,\cdot)$ is pseudo-Lipschitz on $(U,V)$ with modulus
$\ell < r$, that is for any $u, w \in U$

$$\operatorname{ex}(\Psi(x,u) \cap V,\ \Psi(x,w)) \le \ell d(u,w).$$

Then $F(x) = \Psi(x,x)$ is Milyutin regular on $(U,V)$ with $\operatorname{sur}_m F(U|V) \ge r - \ell$.

Proof. We shall consider only the case of complete Graph $\Psi$. According to the general
regularity criterion of Theorem 3.1, all we have to show is that there is a $\xi > 0$ such that,
given $(x,v) \in$ Graph $F$ and $y$ such that $x \in U$, $y \in V$ and $0 < d(y,v) < r m(x)$, there is another
point $(x',v') \ne (x,v)$ in the graph of $F$ such that

$$d(y,v') \le d(y,v) - (r - \ell) \max\{d(x,x'),\ \xi d(v,v')\}.$$

We have by (a): $B(v,rt) \subset \Psi(B(x,t),x)$ if $t < m(x)$. As $d(y,v) < r m(x)$, it follows
that there is an $x' \in B(x,t)$ such that $y \in \Psi(x',x)$ and $d(x,x') \le r^{-1} d(y,v)$.

Clearly, $x' \in U$. Therefore by (b) $d(y,\Psi(x',x')) \le \ell d(x,x')$. This means that there is
a $v' \in F(x')$ such that

$$d(y,v') \le \ell d(x,x') \le \frac{\ell}{r} d(y,v).$$

Take $\xi < (r + \ell)^{-1}$. Then

$$\xi d(v,v') \le (r + \ell)^{-1}\big(d(v,y) + d(y,v')\big) \le (r + \ell)^{-1}\Big(1 + \frac{\ell}{r}\Big) d(y,v) = \frac{1}{r} d(y,v).$$

Thus $\max\{d(x,x'),\ \xi d(v,v')\} \le r^{-1} d(y,v)$ and we have

$$d(y,v') \le \frac{\ell}{r} d(y,v) = d(y,v) - \frac{r - \ell}{r} d(y,v) \le d(y,v) - (r - \ell) \max\{d(x,x'),\ \xi d(v,v')\},$$

as needed. □

Corollary 4.2 (Milyutin's perturbation theorem [45]). Let $X$ be a metric space, let $Y$
be a normed space and $F : X \rightrightarrows Y$ and $G : X \rightrightarrows Y$. We assume that either the graphs
of $F$ and $G$ are complete or $X$ is a complete space. Let further $U \subset X$ be an open set
such that $F$ is Milyutin regular on $U$ with $\operatorname{sur}_m F(U) \ge r$ and $G$ is (Hausdorff) Lipschitz
with $\operatorname{lip} G(U) \le \ell < r$. If either $F$ or $G$ is single-valued continuous on $U$, then $F + G$ is
Milyutin regular on $U$ and $\operatorname{sur}_m (F + G)(U) \ge r - \ell$.

Proof. Apply the theorem to $\Psi(x,u) = F(x) + G(u)$. □
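The estimate $r - \ell$ can be observed numerically in the simplest single-valued case. The sketch below takes $F(x) = 3x$ (covering with modulus $r = 3$) and the perturbation $g(x) = \sin x$ (Lipschitz with constant $\ell = 1$) on the real line and checks on a grid that $F + g$ covers at rate $r - \ell = 2$. The names `covers_at_rate`, `perturbed` and the grid are assumptions of this illustration; a grid check is only a sanity test, not a proof.

```python
import math


def covers_at_rate(f, x, t, rate):
    """For a continuous increasing f on R, f([x - t, x + t]) = [f(x - t), f(x + t)].

    Return True if this interval contains [f(x) - rate * t, f(x) + rate * t].
    """
    return f(x + t) - f(x) >= rate * t and f(x) - f(x - t) >= rate * t


r, ell = 3.0, 1.0


def perturbed(x):
    """F(x) + g(x) with F(x) = r * x and the 1-Lipschitz perturbation g = sin."""
    return r * x + math.sin(x)


grid = [k / 10 for k in range(-50, 51)]
radii = [0.01, 0.1, 1.0]
all_ok = all(covers_at_rate(perturbed, x, t, r - ell) for x in grid for t in radii)

# Near x = pi the derivative 3 + cos(x) is close to 2, so a rate of 3.5 fails there.
fails_above = not covers_at_rate(perturbed, 3.1, 0.01, 3.5)
```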

To state a local version of the theorem, we need the following

Definition 4.3 (uniform regularity). Let $P$ be a topological space, let $F : P \times X \rightrightarrows Y$, let
$\bar p \in P$, and let $(\bar x,\bar y) \in$ Graph $F(\bar p,\cdot)$. We shall say that $F$ is regular near $(\bar x,\bar y)$ uniformly
in $p \in P$ near $\bar p$ if for any $r < \operatorname{sur} F(\bar p,\cdot)(\bar x|\bar y)$ there are $\varepsilon > 0$ and a neighborhood $W \subset P$
of $\bar p$ such that for any $p \in W$ and any $x$ with $d(x,\bar x) < \varepsilon$

$$B(F(p,x),rt) \cap B(\bar y,\varepsilon) \subset F(p,B(x,t)), \quad \text{if } 0 \le t < \varepsilon.$$

Theorem 4.4 (stability under Lipschitz perturbations: local version). Let $X$, $Y$,
$\Psi : X \times X \rightrightarrows Y$ and $F(x) = \Psi(x,x)$ be as in Theorem 4.1 and let $(\bar x,\bar y) \in$ Graph $F$. We
assume that

(a) $\Psi(\cdot,u)$ is regular near $(\bar x,\bar y)$ uniformly in $u$ near $\bar x$;

(b) $\Psi(x,\cdot)$ is pseudo-Lipschitz near $(\bar x,\bar y)$ uniformly in $x$ near $\bar x$.

If $\operatorname{lip} \Psi(\bar x,\cdot)(\bar x|\bar y) < \ell < r < \operatorname{sur} \Psi(\cdot,\bar x)(\bar x|\bar y)$, then $F$ is regular near $(\bar x,\bar y)$ with modulus of
surjection greater than $r - \ell$.

The last theorem in turn immediately implies Milyutin's theorem; its versions
correspond to $\Psi(x,y) = F(x) + g(y)$ with $g$ being single-valued Lipschitz. The following
corollary of the theorems is straightforward.

Theorem 4.5 (Milyutin's perturbation theorem - local version). Let $X$ be a metric space,
let $Y$ be a normed space, and let $F : X \rightrightarrows Y$ and $G : X \rightrightarrows Y$. Given $\bar x \in \operatorname{dom} F \cap \operatorname{dom} G$,
$\bar y \in F(\bar x)$, $\bar z \in G(\bar x)$, we assume that $F$ is regular near $(\bar x,\bar y)$ with $\operatorname{sur} F(\bar x|\bar y) > r$ and $G$
has the Aubin property near $(\bar x,\bar z)$ with $\operatorname{lip} G(\bar x|\bar z) < \ell$. If either $F$ or $G$ is single-valued
continuous on its domain and the graph of the other is complete in the product metric,
then

$$\operatorname{sur}(F + G)(\bar x|\bar y + \bar z) > r - \ell.$$

Proof. Set $\Psi(x,y) = F(x) + G(y)$. It is an easy matter to check that the conditions of
Theorem 4.4 are valid. □

As an immediate consequence of the last theorem we mention a stronger version of
the Lyusternik-Graves theorem, stating that its condition is not only sufficient but also
necessary for regularity.

Corollary 4.6 (Lyusternik-Graves from Milyutin). Let $X$ and $Y$ be Banach spaces, and
let $F : X \to Y$ be strictly differentiable at $\bar x$. Then $\operatorname{sur} F(\bar x) = C(F'(\bar x))$.

Proof. Indeed, let $X$, $Y$ be Banach spaces, and let $F : X \to Y$ be strictly differentiable
at $\bar x$. Set $g(x) = F(x) - F'(\bar x)(x - \bar x)$. As $F$ is strictly differentiable at $\bar x$, the Lipschitz
constant of $g$ at $\bar x$ is zero, which by Milyutin's theorem means that the moduli of surjection
of $F$ at $\bar x$ and of $F'(\bar x)$ coincide. □

We observe next that in Theorem 4.5 one of the mappings is assumed single-valued.
This assumption is essential. With both mappings set-valued the result may be wrong as
the following example shows.

Example 4.7 (cf. [55]). Let $X = Y = \mathbb{R}$, $G(x) = \{x^2, -1\}$, $F(x) = \{-2x, 1\}$. It is
easy to see that $F$ is regular near $(0,0)$ and $G$ is Lipschitz in the Hausdorff metric. On
the other hand,

$$\Phi(x) = F(x) + G(x) = \{x^2 - 2x,\ x^2 + 1,\ -2x - 1,\ 0\}$$

is not even regular at $(0,0)$. Indeed $(\xi,0) \in$ Graph $\Phi$ for any $\xi$. However, if $\xi \ne 0$, then
the $\Phi$-image of a sufficiently small neighborhood of $\xi$ does not contain points of a small
neighborhood of zero other than zero itself.
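The failure of regularity in Example 4.7 can be seen in a small computation. The following Python sketch forms the sum $F(x) + G(x)$ with $F(x) = \{-2x, 1\}$, $G(x) = \{x^2, -1\}$ and samples the image of a ball around $\xi = 0.5$. The function names, the sample size and the radius $0.1$ are assumptions of this illustration, and sampling only approximates the image of the ball.

```python
def F(x):
    """F(x) = {-2x, 1}."""
    return {-2.0 * x, 1.0}


def G(x):
    """G(x) = {x^2, -1}."""
    return {x * x, -1.0}


def Phi(x):
    """Sum F(x) + G(x) = {a + b : a in F(x), b in G(x)}."""
    return {a + b for a in F(x) for b in G(x)}


def image_of_ball(center, radius, samples=2001):
    """Approximate Phi(B(center, radius)) by sampling the interval uniformly."""
    values = set()
    for k in range(samples):
        x = center - radius + 2.0 * radius * k / (samples - 1)
        values |= Phi(x)
    return values


xi = 0.5
zero_is_a_value = 0.0 in Phi(xi)          # (xi, 0) lies in Graph Phi
gap = min(abs(v) for v in image_of_ball(xi, 0.1) if v != 0.0)
```

Here `gap` is about $0.64$: apart from $0$ itself, the sampled image of $B(0.5, 0.1)$ stays away from zero, so no ball around $0$ of radius proportional to the radius in $X$ is covered.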

Perturbation analysis of regularity properties was initiated by Dmitruk-Milyutin-Osmolovskii
in [45] with a proof of a global version of Corollary 4.2 (attributed in [45] to Milyutin) with
both the mapping and the perturbation single-valued. The first perturbation result for
set-valued mappings was probably proved by Ursescu [168] (see also [88]). Observe that
global theorems are valid for Lipschitz set-valued perturbations as well.

Until very recently the main attention was devoted to additive perturbations into a
linear range space, especially in connection with implicit function theorems for generalized
equations - see e.g. [?]. Interest in non-additive Lipschitz set-valued perturbations of
set-valued mappings appeared just a few years ago, partly in connection with fixed point
and coincidence theorems [?, 92, 97].

The Graves theorem can be viewed as a perturbation theorem for a linear regular
operator. For that reason in some publications (e.g. [50, 55]) this theorem is called the "extended
Lyusternik-Graves theorem". I believe the name "Milyutin theorem" is adequate. It is
quite obvious that Graves did not have in mind the perturbation issue and was interested
only in the quality of approximation needed to get the result. (Tikhomirov and I had a
similar idea when proving the metric regularity counterpart of the Graves theorem for [103]
without any knowledge of Graves' paper.) And the fact that the Lipschitz property of
the perturbation is the key to the estimate was explicitly emphasized in [45]. Note also
that even Corollary 4.6 cannot be obtained from the Graves theorem.

Milyutin's theorem can also be viewed as a regularity result for a composition $\Phi(x,F(x))$,
where $\Phi(x,y) = G(x) + y$. Theorems 4.1 and 4.4 can be applied to prove regularity of more
general compositions, with arbitrary $\Phi$, just by taking $\Psi(x,u) = \Phi(x,F(u))$. However, a
certain caution is needed to guarantee that such a $\Psi$ satisfies the required assumptions
(as say in [92] where $\Phi(x,\cdot)$ is assumed to be an isometry or in [61] where a certain
"composition stability" is a priori assumed). Corollary 4.6 was first stated in [49] with a direct
proof, not using Milyutin's theorem.

4.2 Strong regularity and metric implicit function theorem. 

Generally speaking, the essence of the inverse function theorem is already captured by
the main Equivalence Theorem 2.7. But in view of the very special role of the inverse
and implicit function theorems in the classical theory, it seems appropriate to make the
connection with the classical results more transparent.

So let $F : X \times P \rightrightarrows Y$. We shall view $P$ as a parameter space. Let
$S(y,p) = \{x \in X : y \in F(x,p)\}$ stand for the solution mapping of the inclusion $y \in F(x,p)$. In all
theorems to follow we consider $Y \times P$ with an $\ell^1$-type distance

$$d_\alpha((y,p),(y',p')) = d(y,y') + \alpha d(p,p'),$$

where $\alpha$ will be further determined by Lipschitz moduli of mappings involved.

Theorem 4.8 (general proposition on implicit functions). We assume that $\bar y \in F(\bar x,\bar p)$ and
$F$ satisfies the following conditions: there are constants $K > 0$, $\alpha > 0$ and a sufficiently
small $\varepsilon > 0$ such that the following relations hold:

(a) $F(\cdot,p)$ is regular near $((\bar x,\bar y),\bar p)$ uniformly in $p$ with the rate of metric regularity
not greater than $K$;

(b) $F(x,\cdot)$ is pseudo-Lipschitz near $(\bar x,(\bar p,\bar y))$ uniformly in $x$ with the Lipschitz modulus
not greater than $\alpha$.

Then $S$ has the Aubin property near $((\bar y,\bar p),\bar x)$ with the Lipschitz modulus with respect to
the metric $d_\alpha$ in $Y \times P$ not greater than $\operatorname{reg} F(\cdot,\bar p)(\bar x|\bar y)$.

In particular, if we are interested in solutions of the inclusion $y \in F(x,p)$ (with fixed
$y$), then under the assumption of the theorem the solution mapping $p \mapsto S_y(p)$ has the
Aubin property near $(\bar p,\bar x)$ with Lipschitz modulus not exceeding $K\alpha$.


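Proof. As $F(\bar x,\bar p) \ne \emptyset$, the uniform pseudo-Lipschitz property implies that $S(y,p) \ne \emptyset$ for
$(y,p)$ close to $(\bar y,\bar p)$. If now $y \in F(x,p)$, then

$$\begin{aligned} d(x,S(y',p')) &\le K d(y',F(x,p')) \le K\big(d(y,y') + d(y,F(x,p'))\big) \\ &\le K\big(d(y,y') + \alpha d(p,p')\big) = K d_\alpha((y,p),(y',p')) \\ &= K\alpha\big(d(p,p') + \alpha^{-1} d(y,y')\big), \end{aligned}$$

and the proof is completed. □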
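The estimate in the proof can be checked by hand in the simplest linear case. The Python sketch below uses the toy single-valued map $F(x,p) = ax + bp$ on the real line, for which $K = 1/|a|$ and $\alpha = |b|$; the concrete values of `a` and `b` and the helper names are assumptions made for this illustration.

```python
a, b = 2.0, 3.0
K, alpha = 1.0 / abs(a), abs(b)   # regularity rate in x and Lipschitz modulus in p


def S(y, p):
    """Solution mapping of a * x + b * p = y (single-valued in this toy case)."""
    return (y - b * p) / a


def bound_holds(y, p, y2, p2, tol=1e-12):
    """Check |S(y, p) - S(y2, p2)| <= K * (|y - y2| + alpha * |p - p2|) up to rounding."""
    lhs = abs(S(y, p) - S(y2, p2))
    rhs = K * (abs(y - y2) + alpha * abs(p - p2))
    return lhs <= rhs + tol


both_vary = bound_holds(0.0, 0.0, 1.0, -2.0)     # y and p both change
fixed_y = bound_holds(1.0, 1.0, 1.0, 1.5)        # y fixed: the bound is K * alpha * |p - p2|
```

With $y$ fixed the right-hand side reduces to $K\alpha\,|p - p'|$, which is the Lipschitz bound $K\alpha$ for $p \mapsto S_y(p)$ stated in the theorem; in this linear example the bound is attained.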


Definition 4.9. Let F : X ^Y, and let y G F{x). We say that F is strongly (metrically) 
regular near (x, y) G Graph F if for some e > 0, d > 0 and K G [0, oo) 


B{y,6) C F{B{x,e)) Sz d{x, u) < Kd{y, F(x)) (4.1) 


whenever x G B{x,s), u G B{x, s) and y G F{u)f) B{y, 6). 

We shall also say following m that F has a single-valued localization near {x,y) if 
there are $\varepsilon > 0$, $\delta > 0$ such that the restriction of $F(x) \cap B(\bar y,\delta)$ to $B(\bar x,\varepsilon)$ is single-valued. If, in addition, the restriction is Lipschitz continuous, we say that $F$ has a Lipschitz localization near $(\bar x,\bar y)$. 

It is obvious from the definition that strong regularity implies regularity: the second relation in (4.1) is clearly stronger than metric regularity. 

Proposition 4.10 (characterization of strong regularity). Let $F : X \rightrightarrows Y$ and $(\bar x,\bar y) \in \operatorname{Graph}F$. Then the following properties are equivalent: 

(a) $F$ is strongly regular near $(\bar x,\bar y)$; 

(b) there are $\varepsilon > 0$ and $\delta > 0$ such that $B(\bar y,\delta) \subset F(B(\bar x,\varepsilon))$ and 

\[ F(x) \cap F(u) \cap B(\bar y,\delta) = \emptyset, \tag{4.2} \]

whenever $u \neq x$ and both $x$ and $u$ belong to $B(\bar x,\varepsilon)$; 

(c) $F$ is regular near $(\bar x,\bar y)$ and there are $\varepsilon > 0$, $\delta > 0$ such that $F^{-1}$ has a single-valued localization near $(\bar y,\bar x)$; 

(d) $F^{-1}$ has a Lipschitz localization $G(y)$ near $(\bar y,\bar x)$. In particular $y \in F(G(y))$ for all $y$ of a neighborhood of $\bar y$. 

Moreover, if $F$ is strongly regular near $(\bar x,\bar y)$, then the lower bound of $K$ for which the second part of (4.1) holds and the Lipschitz modulus of its Lipschitz localization $G$ at $\bar y$ coincide with $\operatorname{reg}F(\bar x|\bar y)$. 

Theorem 4.11 (persistence of strong regularity under Lipschitz perturbation). We consider a set-valued mapping $\Phi : X \rightrightarrows Y$ with complete graph, and a (single-valued) mapping $G : X \times Y \to Z$. Let $\bar y \in \Phi(\bar x)$ and $\bar z = G(\bar x,\bar y)$. We assume that 

(a) $\Phi$ is strongly regular near $(\bar x,\bar y)$ with $\operatorname{sur}\Phi(\bar x|\bar y) > r$; 

(b) $G(x,\cdot)$ is an isometry from $Y$ onto $Z$ for any $x$ of a neighborhood of $\bar x$; 

(c) $G(\cdot,y)$ is Lipschitz with constant $\ell < r$ in a neighborhood of $\bar x$, the same for all $y$ of a neighborhood of $\bar y$. 

Set $F(x) = G(x,\Phi(x))$. Then $F$ is strongly regular near $(\bar x,\bar z)$. 

In particular, if $Y$ is a normed space, $\Phi$ is strongly regular near $(\bar x,\bar y) \in \operatorname{Graph}\Phi$ and $G(x,y) = g(x) + y$ with $\operatorname{lip}g(\bar x) < \operatorname{sur}\Phi(\bar x|\bar y)$, then $F(x) = \Phi(x) + g(x)$ is strongly regular near $(\bar x,\bar y + g(\bar x))$. 

Remark 4.12. It is to be observed in connection with the last theorem that strong regularity is not preserved under set-valued perturbations like those in Theorem 4.1. Here is a simple example: 

\[ \Psi(x,u) = x + u^2[-1,1] \quad (x,u \in \mathbb{R}), \qquad \bar x = 0. \]

Clearly $\Psi(\cdot,0)$ is strongly regular, but $F(x) = x + x^2[-1,1]$ is of course regular but not strongly regular. 

It follows that strong regularity is somewhat less robust compared to the standard regularity. 
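The failure of strong regularity for $F(x) = x + x^2[-1,1]$ can be seen through condition (4.2) of Proposition 4.10: arbitrarily close to $\bar x = 0$ there are distinct points whose images intersect. The following minimal sketch (the function names and the particular choice of the pair $x$, $u$ are illustrative assumptions) represents $F(x)$ as the interval $[x - x^2, x + x^2]$ and exhibits such a pair inside $(0,\varepsilon)$.

```python
def F(x):
    """The interval F(x) = x + x^2 * [-1, 1], returned as a pair (lo, hi)."""
    r = x * x
    return (x - r, x + r)

def intersect(a, b):
    """Intersection of two closed intervals, or None if it is empty."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None

def overlapping_pair(eps):
    """Return x != u in (0, eps), 0 < eps <= 1, together with F(x) ∩ F(u)."""
    x = eps / 2.0
    u = x + x * x / 2.0   # |x - u| = x^2 / 2 < x^2 + u^2, so the images overlap
    return x, u, intersect(F(x), F(u))
```

Since the images of distinct points overlap no matter how small $\varepsilon$ is, $F^{-1}$ has no single-valued localization near $(0,0)$, although $F(0) = \{0\}$.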

Theorem 4.13 (implicit function theorem - metric version). Assume in addition to the assumptions of Theorem 4.8 that 

\[ F(x,p) \cap F(x',p) \cap B(\bar y,\varepsilon) = \emptyset \quad \forall\ x,x' \in B(\bar x,\varepsilon),\ x \neq x',\ p \in B(\bar p,\varepsilon). \tag{4.3} \]

Then the solution map $S$ has a Lipschitz localization $G$ near $((\bar p,\bar y),\bar x)$ with $\operatorname{lip}G(\bar p,\bar y) \le K$ (with respect to the $d^1_\alpha$-metric in $Y \times P$). In particular $y \in F(G(p,y),p)$ for all $(p,y)$ of a neighborhood of $(\bar p,\bar y)$. 

The conclusion is already very similar to the conclusion of the classical implicit function theorem. Indeed, it contains precisely the same information about the solution, namely its uniqueness in a neighborhood and its Lipschitz continuity (replacing differentiability), with the Equivalence Theorem 2.7 providing, along with the concluding part of Proposition 4.10, an estimate for the Lipschitz constant of the solution map (replacing the formulas for partial derivatives in the classical theorem). Moreover, the proof below is based on the same main idea as the proof of the classical theorem, say the second proof in [55]. 

Proof. Consider the set-valued mapping $\Phi$ from $X \times P$ into $P \times Y$ defined by 

\[ \Phi(x,p) = \{p\} \times F(x,p). \]

Then $(\bar p,\bar y) \in \Phi(\bar x,\bar p)$. We claim that $\Phi$ is strongly regular near $((\bar x,\bar p),(\bar p,\bar y))$. Indeed, we have for $x$, $p$, $y$ sufficiently close to $\bar x$, $\bar p$, $\bar y$ 

\[ \Phi^{-1}(p,y) = \{p\} \times S(p,y). \tag{4.4} \]

By Theorem 4.8 $S$ has the Aubin property at $((\bar p,\bar y),\bar x)$. This obviously implies that $\Phi^{-1}$ has the Aubin property at $((\bar p,\bar y),(\bar x,\bar p))$. The latter means that $\Phi$ is regular at $((\bar x,\bar p),(\bar p,\bar y))$. 

On the other hand, $(p,y) \in \Phi(x,p) \cap \Phi(x',p')$ means that $p = p'$ and $y \in F(x,p) \cap F(x',p)$, so that by (4.3) this may happen only if $x = x'$. This proves the claim. 

By Proposition 4.10 there is a Lipschitz localization of $\Phi^{-1}$ defined in a neighborhood of $(\bar p,\bar y)$. By (4.3) this localization has the form $(p,G(p,y))$, where $G(p,y) \in S(p,y)$. Thus $G$ is a Lipschitz localization of $S$ and by Theorem 4.8 its Lipschitz constant is not greater than $K$. □ 

Theorem 4.14 (metric infinitesimal implicit function theorem). Let $\bar y \in F(\bar x,\bar p)$, and assume that there are $\xi > 0$, $r > 0$, $\ell > 0$, $\varepsilon > 0$ such that for all $x, y, p, v$ satisfying 

\[ d(x,\bar x) < \varepsilon, \quad d(y,\bar y) < \varepsilon, \quad d(p,\bar p) < \varepsilon, \]

either Graph $F$ is complete and 

(a$_1$) $|\nabla_\xi\varphi_y(\cdot,p)|(x,v) > r$ if $v \in F(x,p)$ and $d(y,v) > 0$ 

or $X$ is a complete space and 

(a$_2$) $|\nabla\psi_y(\cdot,p)|(x) > r$ if $\psi_y(x,p) > 0$ 

holds along with 

(b) $|\nabla\psi_y(x,\cdot)|(p) < \ell\, d(p,p')$, if $y \in F(x,p')$ for some $p' \in B(\bar p,\varepsilon)$. 

Then $S$ has the Aubin property near $(\bar y,\bar p)$ with $\operatorname{lip}S((\bar y,\bar p)|\bar x) \le r^{-1}$ if $Y \times P$ is considered with the distance $d_\ell((y,p),(y',p')) = \ell d(p,p') + d(y,y')$. 

The proof of the theorem consists in verifying the assumptions of Theorem 4.8 for all $(x,y,p)$ of a neighborhood of $(\bar x,\bar p,\bar y)$ and $p'$ close to $\bar p$. 

The next theorem is an infinitesimal counterpart of Theorem 4.13. 

Theorem 4.15. In addition to the conditions of Theorem 4.14 we assume that 

(c) $|\nabla\psi_y(\cdot,p)|(x) > 0$ if $y \in F(x',p)$ for some $x' \neq x$. 

Then $S$ has a Lipschitz localization $G$ in a neighborhood of $(\bar p,\bar y)$ with $G(\bar p,\bar y) = \bar x$ and the Lipschitz constant (with respect to the $d_\ell$-metric in $P \times Y$) not exceeding $r^{-1}$. 

Proof. Indeed, it follows from (c) that $y \notin F(x,p)$, that is $(F(x,p) \cap F(x',p)) \cap B(\bar y,\varepsilon) = \emptyset$ for $x$, $x'$ close to $\bar x$ and $p$ close to $\bar p$, and the reference to Theorems 4.14 and 4.13 completes the proof. □ 

There have been numerous publications extending, one way or another, the implicit function theorem to settings of variational analysis, see e.g. [11], [55], [70], [88], [124], [135], [136]. Most of them deal with Banach spaces and/or specific classes of mappings, e.g. associated with generalized equations. It should also be said that some results named “implicit function theorem” are rather parametric regularity or subregularity theorems giving uniform (w.r.t. the parameter) estimates for regularity rates of a mapping depending on a parameter. 

The concept of strong regularity was introduced by Robinson in [154]. A number of characterizations of strong regularity can be found in [55]. It is appropriate to mention (especially because we do not discuss these questions in the paper) that there are certain important classes of mappings for which regularity and strong regularity are equivalent. Such are monotone operators, in particular subdifferentials of convex functions, or Kojima mappings associated with constrained optimization [55], [109]. 

5 Banach space theory. 

Needless to say that the vast majority of applications of the theory of metric regularity 
relate to problems naturally stated in Banach spaces. Variational analysis and metric 
regularity theory in Banach spaces are distinguished by 

(a) the existence of approximation mechanisms, both primal and dual, using homogeneous mappings (graphical derivatives and coderivatives) in the case of set-valued mappings, or directional subderivatives and subdifferentials for functions; 

(b) the possibility of separable reduction for metric regularity, which allows one to reduce much of the analysis to mappings between separable spaces; 

(c) the existence of a class of linear perturbations, most natural and interesting in 
many cases. 

5.1 Techniques of variational analysis in Banach spaces. 

5.1.1 Homogeneous set-valued mappings. 

Definition 5.1. A set-valued mapping $\mathcal{H} : X \rightrightarrows Y$ is homogeneous if its graph is a pointed cone. The latter means that $0 \in \mathcal{H}(0)$. The mapping 

\[ \mathcal{H}^*(y^*) = \{x^* : \langle x^*,x\rangle - \langle y^*,y\rangle \le 0, \ \forall\, (x,y) \in \operatorname{Graph}\mathcal{H}\} \]

is called adjoint or dual to $\mathcal{H}$ (or the dual convex process as it is often called for the reasons to be explained in the next chapter). It is an easy matter to see that 

\[ \operatorname{Graph}\mathcal{H}^* = \{(y^*,x^*) : (x^*,-y^*) \in (\operatorname{Graph}\mathcal{H})^\circ\}. \]

With every homogeneous mapping $\mathcal{H}$ we associate the upper norm 

\[ \|\mathcal{H}\|_+ = \sup\{\|y\| : y \in \mathcal{H}(x),\ x \in \operatorname{dom}\mathcal{H},\ \|x\| \le 1\}, \]

and the lower norm 

\[ \|\mathcal{H}\|_- = \sup_{x \in B_X \cap \operatorname{dom}\mathcal{H}} \inf\{\|y\| : y \in \mathcal{H}(x)\} = \sup_{x \in B_X \cap \operatorname{dom}\mathcal{H}} d(0,\mathcal{H}(x)). \]

For single-valued mappings with $\operatorname{dom}\mathcal{H} = X$ both quantities coincide and we may speak about the norm of $\mathcal{H}$. The mapping $\mathcal{H}$ is bounded if $\|\mathcal{H}\|_+ < \infty$. This obviously means that there is an $r > 0$ such that $\mathcal{H}(x) \subset r\|x\|B_Y$ for all $x$. 

Very often, however, in the context of regularity estimates, it is more convenient to deal with different quantities defined by way of the norms as follows: 

\[ C(\mathcal{H}) = \|\mathcal{H}^{-1}\|_-^{-1} \quad \text{and} \quad C^*(\mathcal{H}) = \|\mathcal{H}^{-1}\|_+^{-1}. \]

The quantities are respectively called the Banach constant and the dual Banach constant of $\mathcal{H}$. To justify the terminology, note that for linear operators they coincide with the Banach constants introduced for the latter in the first section. 
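To make the two constants concrete, here is a minimal sketch for a hypothetical homogeneous mapping on the real line, $\mathcal{H}(x) = [x/2, 2x]$ for $x \ge 0$ and $\mathcal{H}(x) = [2x, x/2]$ for $x < 0$ (the mapping and all function names are assumptions chosen for illustration). Its inverse is $\mathcal{H}^{-1}(y) = [y/2, 2y]$ for $y \ge 0$ and $[2y, y/2]$ for $y < 0$, so the lower and upper norms of $\mathcal{H}^{-1}$ can be evaluated by sampling the unit ball of $Y = \mathbb{R}$.

```python
def H_inv(y):
    """H^{-1}(y) for the cone between the lines y = x/2 and y = 2x, as (lo, hi)."""
    a, b = 0.5 * y, 2.0 * y
    return (min(a, b), max(a, b))

def dist_to_zero(interval):
    """Distance from 0 to a closed interval [lo, hi]."""
    lo, hi = interval
    return 0.0 if lo <= 0.0 <= hi else min(abs(lo), abs(hi))

def banach_constants(num=200):
    """Return (C(H), C*(H)) computed from the lower and upper norms of H^{-1}."""
    ys = [-1.0 + 2.0 * k / num for k in range(num + 1)]    # grid on the unit ball of Y
    lower = max(dist_to_zero(H_inv(y)) for y in ys)         # lower norm of H^{-1}
    upper = max(max(abs(lo), abs(hi)) for lo, hi in map(H_inv, ys))  # upper norm of H^{-1}
    return 1.0 / lower, 1.0 / upper
```

In this example $C(\mathcal{H}) = 2$ and $C^*(\mathcal{H}) = 1/2$, so for genuinely set-valued mappings the two constants may differ, in contrast to the linear case.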

The proposition below, containing an important geometric interpretation of the concepts, shows that the Banach constants are actually very natural objects. 

Proposition 5.2 (cf. Proposition 1.3). For any homogeneous $\mathcal{H} : X \rightrightarrows Y$ 

\[ C(\mathcal{H}) = \operatorname{contr}\mathcal{H}(0|0) = \sup\{r \ge 0 : rB_Y \subset \mathcal{H}(B_X)\}; \]

\[ C^*(\mathcal{H}) = (\operatorname{subreg}\mathcal{H}(0|0))^{-1} = \inf\{\|y\| : y \in \mathcal{H}(x),\ \|x\| = 1\} = \inf_{\|x\|=1} d(0,\mathcal{H}(x)). \]

Proof. The equality $\operatorname{contr}\mathcal{H}(0|0) = \sup\{r \ge 0 : rB_Y \subset \mathcal{H}(B_X)\}$ follows from homogeneity of $\mathcal{H}$. On the other hand, saying that $rB_Y \subset \mathcal{H}(B_X)$ is the same as saying that for any $y$ with $\|y\| = r$ there is an $x$ with $\|x\| \le 1$ such that $x \in \mathcal{H}^{-1}(y)$, which means that $\|\mathcal{H}^{-1}\|_- \le r^{-1}$ and therefore $C(\mathcal{H}) \ge \operatorname{contr}\mathcal{H}(0|0)$. Likewise, $\|\mathcal{H}^{-1}\|_- < r^{-1}$ means that for any $y$ with $\|y\| = 1$ there is an $x$ with $\|x\| < r^{-1}$ such that $y \in \mathcal{H}(x)$, from which we get that $rB_Y \subset \mathcal{H}(B_X)$, and the first equality follows. 

To prove the second equality, consider first the case $C^*(\mathcal{H}) < \infty$. Then 

\[ C^*(\mathcal{H}) = \|\mathcal{H}^{-1}\|_+^{-1} = \inf_{\|y\|=1} \inf\{\|x\|^{-1} : x \in \mathcal{H}^{-1}(y)\} = \inf\{\|y\| : y \in \mathcal{H}(x),\ \|x\| = 1\}. \]

If $C^*(\mathcal{H}) = \infty$, and therefore $\|\mathcal{H}^{-1}\|_+ = 0$, then for any $y$ the set $\mathcal{H}^{-1}(y)$ is either empty (recall our convention: $\inf\emptyset = \infty$, $\sup\emptyset = 0$) or contains only the zero vector. Hence the domain of $\mathcal{H}$ is a singleton containing the origin. It follows that $\inf\{\|y\| : y \in \mathcal{H}(x),\ \|x\| = 1\} = \inf\emptyset = \infty$. 

This proves the left equality. Consider again the case $C^*(\mathcal{H}) > 0$. Then $\|\mathcal{H}^{-1}\|_+ < \infty$ and consequently $\mathcal{H}^{-1}(0) = \{0\}$. It follows that $d(x,\mathcal{H}^{-1}(0)) = \|x\|$. Setting $K = (C^*(\mathcal{H}))^{-1}$, we get for any $x$ with $\|x\| = 1$: 

\[ K d(0,\mathcal{H}(x)) \ge 1 = \|x\| = d(x,\mathcal{H}^{-1}(0)), \]

and on the other hand for any $K' < K$ we can find an $x$ with $\|x\| = 1$ such that $K' d(0,\mathcal{H}(x)) < 1$. It follows that $K = \operatorname{subreg}\mathcal{H}(0|0)$. The case $C^*(\mathcal{H}) = 0$ is treated as above. □ 

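Corollary 5.3. For any homogeneous mappings $\mathcal{H} : X \rightrightarrows Y$ and $\mathcal{E} : Y \rightrightarrows Z$ 

\[ C(\mathcal{E} \circ \mathcal{H}) \ge C(\mathcal{E}) \cdot C(\mathcal{H}). \]

Proof. Take $\rho < C(\mathcal{H})$. Then $\rho B_Y \subset \mathcal{H}(B_X)$ and therefore 

\[ C(\mathcal{E} \circ \mathcal{H}) = \sup\{r \ge 0 : rB_Z \subset (\mathcal{E} \circ \mathcal{H})(B_X)\} \ge \sup\{r \ge 0 : rB_Z \subset \mathcal{E}(\rho B_Y)\} = \rho\, C(\mathcal{E}), \]

and the result follows. □ 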


We shall see that the tangential (primal) regularity estimates are stated in terms of Banach constants of contingent derivatives of the mapping, while the subdifferential estimates need dual Banach constants of coderivatives. The following theorem is the first indicator that (surprisingly!) the dual estimates can be better. 

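Theorem 5.4 (basic inequality for Banach constants). For any homogeneous set-valued mapping $\mathcal{H} : X \rightrightarrows Y$ 

\[ C^*(\mathcal{H}^*) \ge C(\mathcal{H}) \ge C^*(\mathcal{H}). \]

Note that for linear operators we have equality - see Proposition 1.3. In the next section we shall see that the equality also holds for convex processes and some other set-valued mappings. 

Proof. The right inequality is immediate from the definition. If $C(\mathcal{H}) = \infty$, that is $\|\mathcal{H}^{-1}\|_- = 0$, then for any $y \in Y$ there is a sequence $(x_n) \subset X$ norm converging to zero and such that $y \in \mathcal{H}(x_n)$. It is easy to see that in this case 

\[ \mathcal{H}^*(y^*) = \begin{cases} \emptyset, & \text{if } y^* \neq 0; \\ X^*, & \text{if } y^* = 0, \end{cases} \tag{5.1} \]

that is $(\mathcal{H}^*)^{-1}(x^*) = \{0\}$, $\|(\mathcal{H}^*)^{-1}\|_+ = 0$ and hence $C^*(\mathcal{H}^*) = \infty$. 

Let now $\infty > C(\mathcal{H}) = r > 0$. Set $\lambda = r^{-1}$. Then $\|\mathcal{H}^{-1}\|_- = \lambda$, so that for any $y$ with $\|y\| = 1$ and any $\varepsilon > 0$ there is an $x$ such that $\|x\| \le \lambda + \varepsilon$ and $y \in \mathcal{H}(x)$. Let now $x^* \in \mathcal{H}^*(y^*)$, that is $\langle x^*,x\rangle - \langle y^*,y\rangle \le 0$ if $y \in \mathcal{H}(x)$. Take $y \in S_Y$ such that $\langle y^*,y\rangle \le (-1 + \varepsilon)\|y^*\|$ and choose an $x \in \mathcal{H}^{-1}(y)$ with $\|x\| \le \lambda + \varepsilon$. Then 

\[ -(\lambda + \varepsilon)\|x^*\| \le \langle x^*,x\rangle \le \langle y^*,y\rangle \le (-1 + \varepsilon)\|y^*\|, \]

that is $(\lambda + \varepsilon)\|x^*\| \ge (1 - \varepsilon)\|y^*\|$. As $\varepsilon$ can be chosen arbitrarily close to zero, this implies that $\|(\mathcal{H}^*)^{-1}\|_+ \le r^{-1}$ and therefore $C^*(\mathcal{H}^*) \ge r = C(\mathcal{H})$. □ 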

The following property plays an essential role in future discussions. 

Definition 5.5 (non-singularity). We say that $\mathcal{H}$ is non-singular if $C^*(\mathcal{H}) > 0$. Otherwise we shall call $\mathcal{H}$ singular. 

We conclude the subsection by showing that regularity of a homogeneous mapping near the origin implies its global regularity. 

Proposition 5.6. Let $X$ and $Y$ be two Banach spaces, and let $F : X \rightrightarrows Y$ be a homogeneous set-valued mapping. If $F$ is regular near $(0,0)$, then it is globally regular with the same rates. 

Proof. By the assumption, there are $K > 0$ and $\varepsilon > 0$ such that $d(x,F^{-1}(y)) \le K d(y,F(x))$ if $\max\{\|x\|,\|y\|\} < \varepsilon$. Let now $(x,y)$ be an arbitrary point of the graph. Set $m = \max\{\|x\|,\|y\|\}$, and let $\mu < \varepsilon/m$. Then, by homogeneity, 

\[ \mu\, d(x,F^{-1}(y)) = d(\mu x,F^{-1}(\mu y)) \le K d(\mu y,F(\mu x)) = \mu K d(y,F(x)), \]

whence $d(x,F^{-1}(y)) \le K d(y,F(x))$. □ 

The norms for homogeneous multifunctions were first introduced by Rockafellar [157] and Robinson [148] in the context of convex processes (lower norm) and then by Ioffe [82] (upper norm for arbitrary homogeneous maps) and Borwein [23] (upper norm and duality for convex processes - see also [Ml ESI [55|). The dual Banach constant $C^*$ was also introduced in [H2]. The meaning of the primal constant has undergone some evolution since it first appeared in [82]. The $C(\mathcal{H})$ introduced here is reciprocal to that in [84], mainly because the connection of Banach constants with the norms of homogeneous mappings makes the present definition more natural. 

5.1.2 Tangent cones and contingent derivatives 

Let a set $Q \subset X$ and an $x \in Q$ be given. The tangent (or contingent) cone $T(Q,x)$ is the collection of $h \in X$ with the following property: there are sequences $t_k \downarrow 0$ and $h_k \to h$ such that $x + t_k h_k \in Q$ for all $k$. If $F : X \rightrightarrows Y$, then the contingent or graphical derivative of $F$ at $(x,y)$ is the set-valued mapping 

\[ X \ni h \mapsto DF(x,y)(h) = \{v \in Y : (h,v) \in T(\operatorname{Graph}F,(x,y))\}. \]

Let now $f$ be a function on $X$ finite at $x$. The function 

\[ h \mapsto f^-(x;h) = \liminf_{(t,h') \to (0+,h)} t^{-1}\big(f(x + th') - f(x)\big) \]

is called the Dini-Hadamard lower directional derivative of $f$ at $x$. This function is either lsc and equal to zero at the origin or identically equal to $-\infty$. The latter of course cannot happen if $f$ is Lipschitz near $x$. 

The connection between the two concepts is very simple: $h \in T(Q,x)$ if and only if $d^-(\cdot,Q)(x;h) = 0$, and $\alpha = f^-(x;h)$ if and only if $(h,\alpha) \in T(\operatorname{epi}f,(x,f(x)))$. 

If $F : X \to Y$, then the contingent derivative of $F$ at $x$ is the set-valued mapping 

\[ X \ni h \mapsto DF(x;h) = \{v \in Y : (h,v) \in T(\operatorname{Graph}F,(x,F(x)))\}. \]

The contingent tangent cone and contingent derivative were introduced by Aubin in [5] (see [8] for detailed comments concerning the genesis of the concept). 

5.1.3 Subdifferentials, normal cones and coderivatives. 

From now on, unless the opposite is explicitly said, all spaces are assumed separable. Thanks to the separable reduction theorem to be proved in the next subsection, such a restriction is justifiable in the context of regularity theory. On the other hand, it provides for a substantial economy of effort, especially in the non-reflexive (or to be precise, non-Asplund) case. 

The subdifferential is among the most fundamental concepts in local variational analysis. Essential for the infinite dimensional variational analysis are five types of subdifferentials: Fréchet subdifferential, Dini-Hadamard subdifferential (the two are sometimes called “elementary subdifferentials”), limiting Fréchet subdifferential, G-subdifferential and the generalized gradient. In Hilbert space there is one more convenient construction, the “proximal subdifferential”. We shall introduce it in § 7. 

So let $f$ be a function on $X$ which is finite at $x$. The sets 

\[ \partial_H f(x) = \{x^* \in X^* : \langle x^*,h\rangle \le f^-(x;h), \ \forall h \in X\} \]

and 

\[ \partial_F f(x) = \{x^* \in X^* : \langle x^*,h\rangle \le f(x+h) - f(x) + o(\|h\|)\} \]

are called respectively the Dini-Hadamard and Fréchet subdifferential of $f$ at $x$. The corresponding limiting subdifferentials at $x$ (we denote them for the time being $\partial_{LH}$ and $\partial_{LF}$) are defined as the collection of $x^*$ such that there is a sequence $(x_n,x_n^*)$ with $x_n$ norm converging to $x$ and $x_n^*$ weak*-converging to $x^*$. The essential point in the definition of the limiting subdifferentials is that only sequential weak*-limits of elements of elementary subdifferentials are considered. The limiting Dini-Hadamard subdifferential is basically an intermediate product in the definition of the G-subdifferential. Given a set $Q \subset X$, the G-normal cone to $Q$ at $x \in Q$ is 

\[ N_G(Q,x) = \bigcup_{\lambda > 0} \lambda\, \partial_{LH} d(\cdot,Q)(x). \]

The G-subdifferential of $f$ at $x$ is defined as follows: 

\[ \partial_G f(x) = \{x^* : (x^*,-1) \in N_G(\operatorname{epi}f,(x,f(x)))\}. \]

The cone $N_C(Q,x) = \operatorname{cl}(\operatorname{conv}N_G(Q,x))$ is Clarke’s normal cone to $Q$ at $x$, and the set 

\[ \partial_C f(x) = \{x^* : (x^*,-1) \in N_C(\operatorname{epi}f,(x,f(x)))\} \]

is the subdifferential or generalized gradient of Clarke. 

Proposition 5.7 (some basic properties of subdifferentials). The following statements hold true: 

(a) for any lsc function $\partial_H f(x) \neq \emptyset$ on a dense subset of $\operatorname{dom}f$; 

(b) the same is true for $\partial_F$ if there is a Fréchet differentiable (off the origin) norm in $X$ (that is if $X$ is an Asplund space); 

(c) if $f$ is Lipschitz near $x$, then $\partial_G f(x) \neq \emptyset$ and the set-valued mapping $x \mapsto \partial_G f(x)$ is compact-valued (see (f) below) and upper semicontinuous; 

(d) if $f$ is continuously (or strictly) differentiable at $x$, then $\partial f(x) = \{f'(x)\}$ for any of the mentioned subdifferentials; 

(e) if $f$ is convex, then all mentioned subdifferentials coincide with the subdifferential in the sense of convex analysis: $\partial f(x) = \{x^* : f(x+h) - f(x) \ge \langle x^*,h\rangle, \ \forall h\}$; 

(f) if $f$ is Lipschitz near $x$ with Lipschitz constant $K$, then $\|x^*\| \le K$ for any $x^* \in \partial f(x)$ and any of the mentioned subdifferentials; 

(g) if $f$ is Lipschitz near $x$, then $\partial_{LH}f(x) = \partial_G f(x)$ and $\partial_C f(x) = \operatorname{cl}(\operatorname{conv}\partial_G f(x))$; 

(h) if $f$ is lsc and $X$ is an Asplund space, then $\partial_{LF}f(x) = \partial_G f(x)$ for any $x$; 

(i) if $f(x,y) = \varphi(x) + \psi(y)$, then $\partial f(x,y) = \partial\varphi(x) \times \partial\psi(y)$, where $\partial$ is any of $\partial_H$, $\partial_F$, $\partial_G$ (but not $\partial_C$). 

Remark 5.8. It should be observed in connection with the proposition that 

• $\partial_{LH}$ has little interest for non-Lipschitz functions: it may be too big to contain any useful information about the function. 

• If $X$ is not Asplund, $\partial_F f(x)$ may be identically empty even for a very simple Lipschitz function (e.g. $-\|x\|$ in $C[0,1]$). In the terminology of the subdifferential calculus this means that $\partial_F$ cannot be trusted on non-Asplund spaces. 

We do not need here a formal definition for the concept of a subdifferential trusted on 
a space or a class of spaces (see e.g. [Hj)- Loosely speaking this means that a version 
of the fuzzy variational principle is valid for the subdifferentials of lsc functions on the 
space. Just note that the Frechet subdifferential is trusted on Asplund spaces and only 
on them, Dini-Hadamard subdifferential is trusted on Gateaux smooth spaces and the 
G-subdifferential and the generalized gradient are trusted on all Banach spaces. 

There is one more important property of subdifferentials that has not been mentioned 
in the proposition. This property is called tightness and it characterizes a reasonable 
quality of lower approximation provided by the subdifferential (see m)- It turns out 
that the Dini-Hadamard, Fréchet and G-subdifferentials are tight but Clarke’s generalized 
gradient is not. This determines a relatively small role played by generalized gradient in 
the regularity theory. On the other hand, generalized gradient typically is much easier to 
compute and work with. Moreover, convexity of the generalized gradient makes it the only 
subdifferential that can be used in the critical point theory associated with the concept of 
“weak slope”, not considered here. 

We do not need here the general theory of subdifferentials. We just mention, in connection with the property (h) in Proposition 5.7, that in separable spaces the G-subdifferential is the unique subdifferential having a certain collection of properties (including tightness, (c), (e), (f) and “exact calculus” as defined in the proposition below). It is to be emphasized again that we assume all spaces separable. 

Proposition 5.9 (basic calculus rules). Let $f(x) = f_1(x) + f_2(x)$, where both functions are lsc and one of them is Lipschitz near $x$. Then the following statements are true: 

1. Fuzzy variational principle: if $f$ attains a local minimum at $x$, then there are sequences $(x_{in})$ and $(x^*_{in})$, $i = 1,2$, such that $x_{in} \to x$, $x^*_{in} \in \partial_H f_i(x_{in})$ and $\|x^*_{1n} + x^*_{2n}\| \to 0$; 

2. Fuzzy sum rule: if $X$ is Asplund and $x^* \in \partial_F f(x)$, then there are sequences $(x_{in})$ and $(x^*_{in})$, $i = 1,2$, such that $x_{in} \to x$, $x^*_{in} \in \partial_F f_i(x_{in})$ and $\|x^*_{1n} + x^*_{2n} - x^*\| \to 0$; 

3. Exact sum rule: $\partial_G f(x) \subset \partial_G f_1(x) + \partial_G f_2(x)$. 

Let $Q \subset X$ and $x \in Q$. Given a subdifferential $\partial$, the set 

\[ N(Q,x) = \partial i_Q(x), \]

always a cone, is called the normal cone to $Q$ at $x$ associated with $\partial$. It is an easy matter to see that in the case of $\partial_G$ this definition coincides with the one given earlier. For normal cones associated with $\partial_H$ and $\partial_F$ we use the notation $N_H$ and $N_F$. 

Let $F : X \rightrightarrows Y$ and $y \in F(x)$. Given a subdifferential $\partial$ and the normal cone associated with $\partial$, the set-valued mapping 

\[ y^* \mapsto D^*F(x,y)(y^*) = \{x^* : (x^*,-y^*) \in N(\operatorname{Graph}F,(x,y))\} \]

is called the coderivative of $F$ at $(x,y)$ associated with $\partial$. We use the notation $D^*_H$, $D^*_F$ and $D^*_G$ for the coderivatives associated with the mentioned subdifferentials. 

There is a number of monographs and survey articles in which subdifferentials are 
studied at various levels of generality: |159j (finite dimensional theory), [27l 11301 I144( 
1162] (Asplund spaces), [Ml 1144] (general Banach spaces), [38ll30] (generalized gradients). 
Concerning the sources of the main concepts: Clarke’s subdifferential was the first to appear; it was introduced in Clarke’s 1973 thesis [33] and first appeared in printed form in [34]. It is not clear where the Fréchet subdifferential first appeared, probably in [20]. The Dini-Hadamard subdifferential was introduced by Penot in [141]. The sequential limiting Fréchet subdifferential for functions on Fréchet smooth spaces was introduced by Kruger in the mimeographed paper [112] in 1981 (not in [116] as stated in e.g. [132], [130] and many other publications; the definition given in [116] is purely topological and does not involve sequential weak*-limits) and in printed form appeared in [113] (see [94] for details). The G-subdifferential was first defined in [HI] but its definition was later modified in [85]. 

5.2 Separable reduction. 

In this subsection X and Y are general Banach spaces, not necessarily separable. Recall 
that by S{X) we denote the collection of separable subspaces of X. 

Proposition 5.10. Assume that $\operatorname{sur}F(\bar x|\bar y) > r$. Then for any $L_0 \in S(X)$ and $M \in S(Y)$ there is an $L \in S(X)$ containing $L_0$ such that for sufficiently small $t > 0$ 

\[ y + rt(B_Y \cap M) \subset \operatorname{cl}\big(F(x + t(1+\delta)(B_X \cap L))\big), \]

if $\delta > 0$ and the pair $(x,y) \in (\operatorname{Graph}F) \cap (L \times M)$ is sufficiently close to $(\bar x,\bar y)$. 

Proof. Take an $\varepsilon > 0$ to guarantee that the inclusion below holds for $x \in B(\bar x,\varepsilon)$: 

\[ F(x) \cap B(\bar y,\varepsilon) + trB_Y \subset F(B(x,t)). \tag{5.2} \]

We shall prove that there is a nondecreasing sequence $(L_n)$ of separable subspaces of $X$ such that 

\[ y + rt(B_Y \cap M) \subset \operatorname{cl}\big(F(x + t(1+\delta)(B_X \cap L_{n+1}))\big), \tag{5.3} \]

for all $\delta > 0$ and all $(x,y) \in (\operatorname{Graph}F) \cap (L_n \times M)$ sufficiently close to $(\bar x,\bar y)$. Then to complete the proof, it is sufficient to set $L = \operatorname{cl}(\bigcup L_n)$. 

Assume that we have already $L_n$ for some $n$. Let $(x_i,y_i)$ be a dense countable subset of the intersection of $(\operatorname{Graph}F) \cap (L_n \times M)$ with the neighborhood of $(\bar x,\bar y)$ in which (5.2) is guaranteed, let $(v_j)$ be a dense countable subset of $B_Y \cap M$, and let $(t_k)$ be a dense countable subset of $(0,\varepsilon)$. For any $i,j,k = 1,2,\ldots$ we find from (5.2) an $h_{ijk} \in B_X$ such that $y_i + rt_k v_j \in F(x_i + t_k h_{ijk})$, and let $\hat L_n$ be the subspace of $X$ spanned by the union of $L_n$ and the collection of all $h_{ijk}$. 

If now $(x,y) \in (\operatorname{Graph}F) \cap (L_n \times M)$, $t \in (0,\varepsilon)$, $v \in B_Y$ and $(x_{i_m},y_{i_m})$, $t_{k_m}$, $v_{j_m}$ converge respectively to $(x,y)$, $t$ and $v$, then as $x_{i_m} + t_{k_m}(B_X \cap \hat L_n) \subset x + t(1+\delta)(B_X \cap \hat L_n)$ for sufficiently large $m$, we conclude that (5.3) holds with $\hat L_n$ instead of $L_{n+1}$. □ 

Theorem 5.11 (separable reduction of regularity [96]). Let $X$ and $Y$ be Banach spaces. A set-valued mapping $F : X \rightrightarrows Y$ with closed graph is regular at $(\bar x,\bar y) \in \operatorname{Graph}F$ if and only if for any separable subspace $M \subset Y$ and any separable subspace $L_0 \subset X$ with $(\bar x,\bar y) \in L_0 \times M$ there exists a bigger separable subspace $L \in S(X)$ such that the mapping $F_{L \times M} : L \rightrightarrows M$ whose graph is the intersection of $\operatorname{Graph}F$ with $L \times M$ is regular at $(\bar x,\bar y)$. Moreover, if $\operatorname{sur}F(\bar x|\bar y) > r$, we can choose $L \in S(X)$ and $M \in S(Y)$ containing respectively $\bar x$ and $\bar y$ to make sure that also $\operatorname{sur}F_{L \times M}(\bar x|\bar y) \ge r$. Conversely, if there is an $r > 0$ such that for any separable $M_0 \subset Y$ and $L_0 \subset X$ there are bigger separable subspaces $M \supset M_0$ and $L \supset L_0$ such that $\operatorname{sur}F_{L \times M}(\bar x|\bar y) \ge r$, then $F$ is regular at $(\bar x,\bar y)$ with $\operatorname{sur}F(\bar x|\bar y) \ge r$. 

Proof. So assume that $F$ is regular at $(\bar x,\bar y)$ with $\operatorname{sur}F(\bar x|\bar y) > r$. Then, given $L_0$ and $M$, we can find a closed separable subspace $L \subset X$ containing $L_0$ such that (5.3) holds for any $\delta > 0$, any $(x,y) \in (\operatorname{Graph}F) \cap (L \times M)$ sufficiently close to $(\bar x,\bar y)$ and any sufficiently small $t > 0$. 

By the Density theorem we can drop the closure operation, so that $F_{L \times M}$ is indeed regular near $(\bar x,\bar y)$ with $\operatorname{sur}F_{L \times M}(\bar x|\bar y) \ge (1+\delta)^{-1}r$. As $\delta$ can be arbitrarily small, we get the desired estimate for the modulus of surjection of $F_{L \times M}$. 

On the other hand, if $F$ were not regular at $(\bar x,\bar y)$, then we could find a sequence $(x_n,y_n) \in \operatorname{Graph}F$ converging to $(\bar x,\bar y)$ such that $y_n + (t_n/n)v_n \notin F(B(x_n,t_n))$ for some $t_n < 1/n$ and $v_n \in B_Y$ (respectively $y_n + t_n(r - \delta)v_n \notin F(B(x_n,t_n))$ for some $\delta > 0$). Clearly this carries over to any closed separable subspaces $L \subset X$ and $M \subset Y$ containing respectively all $x_n$, all $y_n$ and all $v_n$, so that no such $F_{L \times M}$ can be regular at $(\bar x,\bar y)$ (with the modulus of surjection $\ge r$), contrary to the assumption. □ 


5.3 Contingent derivatives and primal regularity estimates 

The following simple proposition establishes a connection between the slope of $f$ and its lower directional derivative. 

Proposition 5.12. For any function $f$ and any $x$ at which $f$ is finite 

\[ |\nabla f|(x) \ge -\inf_{\|h\|=1} f^-(x;h). \]

Proof. Take an $h$ with $\|h\| = 1$. We have 

\[ |\nabla f|(x) = \limsup_{t \to +0,\ \|u\|=1} \frac{(f(x) - f(x + tu))^+}{t} \ge \limsup_{t \to +0,\ u \to h} \frac{f(x) - f(x + tu)}{t} = -f^-(x;h), \]

as claimed. 

□ 
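The inequality of Proposition 5.12 may be strict or may hold as an equality. The following minimal one-dimensional sketch illustrates both cases at $x = 0$; it uses crude single-step difference quotients (an assumption that is harmless here only because the test functions $\pm|x|$ are positively homogeneous, so the quotients do not depend on the step), and all function names are illustrative.

```python
def slope_estimate(f, x, t=1e-6):
    """One-step estimate of the slope |∇f|(x) = limsup (f(x) - f(u))^+ / |x - u| on the line."""
    return max(max(f(x) - f(x + s), 0.0) / t for s in (t, -t))

def lower_derivative_estimate(f, x, h, t=1e-6):
    """One-step difference quotient approximating f^-(x; h)."""
    return (f(x + t * h) - f(x)) / t

def bound_from_directions(f, x):
    """The right-hand side -inf_{|h|=1} f^-(x; h); on the line h ranges over {1, -1}."""
    return -min(lower_derivative_estimate(f, x, h) for h in (1.0, -1.0))
```

For $f(x) = -|x|$ both sides are equal to 1, while for $f(x) = |x|$ the slope at the minimizer is 0 and the right-hand side is $-1$.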
The following result is now immediate from the proposition and Theorem 3.2. 

Theorem 5.13 (tangential regularity estimate 1). Let $(\bar x,\bar y) \in \operatorname{Graph}F$. Assume that there are neighborhoods $U$ of $\bar x$ and $V$ of $\bar y$ such that for any $y \in V$ the function $\psi_y$ is lower semicontinuous on $U$ and $\inf_{\|h\|=1}\psi_y^-(x;h) \le -r$ for $x \in U$ and $y \in V$. Then 

\[ \operatorname{sur}F(\bar x|\bar y) \ge r. \tag{5.4} \]

(Of course a similar estimate can be obtained from Theorem 3.10.) 

Theorem 5.14 (tangential regularity estimate 2). Suppose there are a neighborhood $U$ of $(\bar x,\bar y)$ and two numbers $c > 0$ and $\lambda \in [0,1)$ such that for any $(x,y) \in U \cap \operatorname{Graph}F$ 

\[ \operatorname{ex}(S_Y,DF(x,y)(cB_X)) \le \lambda. \tag{5.5} \]

Then 

\[ \operatorname{sur}F(\bar x|\bar y) \ge \frac{1-\lambda}{c}. \tag{5.6} \]

Proof. Take an $(x,v) \in U \cap \operatorname{Graph}F$ with $v \neq y$ and set $z = \|y - v\|^{-1}(y - v)$. By the assumption, for any $\lambda' > \lambda$ there is a pair $(h,w)$ with $w \in DF(x,v)(h)$ such that $\|h\| = c$ and $\|z - w\| \le \lambda'$. As $(h,w)$ belongs to the contingent cone to $\operatorname{Graph}F$ at $(x,v)$, we can find (for sufficiently small $t > 0$) vectors $h(t)$ and $w(t)$ norm converging to $h$ and $w$ respectively and such that $v + tw(t) \in F(x + th(t))$. We have 

\[ \begin{aligned} \|y - (v + tw(t))\| &= \|y - v - tw\| + o(t) \\ &\le \|y - v - tz\| + t\|z - w\| + o(t) \\ &\le \|y - v\|\Big(1 - \frac{t}{\|y - v\|}\Big) + t\lambda' + o(t), \end{aligned} \tag{5.7} \]

so that 

\[ \varphi_y^-((x,v);(h,w)) \le \lim_{t \to +0} \frac{\|y - (v + tw(t))\| - \|y - v\|}{t} \le -(1 - \lambda'). \]

Take a $\xi > 0$ such that $\xi(1 + \lambda) < c$ and consider the $\xi$-norm in $X \times Y$. Then $\|(h,w)\|_\xi \le \max\{c,\xi(1 + \lambda')\} = c$ (if $\lambda'$ is sufficiently close to $\lambda$) and we get from (5.7) 

\[ \inf\{\varphi_y^-((x,v);(h,w)) : \|(h,w)\|_\xi \le 1\} \le \frac{1}{c}\,\varphi_y^-((x,v);(h,w)) \le -\frac{1 - \lambda'}{c}. \]

It remains to refer to Proposition 5.12 and Theorem 3.10. □ 


Theorem 5.15 (tangential regularity estimate 3). Let $X$ and $Y$ be Banach spaces, and let $F : X \rightrightarrows Y$ be a set-valued mapping with locally closed graph. Let finally $\bar y \in F(\bar x)$. Then 

\[ \operatorname{sur}F(\bar x|\bar y) \ge \lim_{\varepsilon \to 0} \inf\{C(DF(x,y)) : (x,y) \in (\operatorname{Graph}F) \cap B((\bar x,\bar y),\varepsilon)\}, \tag{5.8} \]

or equivalently, 

\[ \begin{aligned} \operatorname{reg}F(\bar x|\bar y) &\le \lim_{\varepsilon \to 0} \sup\{\|(DF(x,y))^{-1}\|_- : y \in F(x),\ \|x - \bar x\| < \varepsilon,\ \|y - \bar y\| < \varepsilon\} \\ &= \lim_{\varepsilon \to 0} \sup\Big\{\sup_{\|v\|=1} \inf\{\|h\| : v \in DF(x,y)(h)\} : (x,y) \in (\operatorname{Graph}F) \cap B((\bar x,\bar y),\varepsilon)\Big\}. \end{aligned} \]

Proof. We first note that $DF(x,v)(B_X)$ is a star-shaped set, as it contains zero and $z \in DF(x,v)(h)$ implies that $\lambda z \in DF(x,v)(\lambda h)$ for $\lambda > 0$. On the other hand, by Proposition 5.2, $C(DF(x,v)) > r > 0$ means that $rB_Y \subset DF(x,v)(B_X)$. It follows that $B_Y \subset DF(x,v)(r^{-1}B_X)$. If this is true for all $(x,v) \in \operatorname{Graph}F$ close to $(\bar x,\bar y)$, this in turn means that the condition of Theorem 5.14 is satisfied with $c = r^{-1}$ and $\lambda = 0$, whence the theorem. □ 


Remark 5.16. In fact the last two theorems are equivalent. Indeed, let the conditions of Theorem 5.14 be satisfied. Then $(1 - \lambda)B_Y \subset DF(x,v)(cB_X)$ for all $(x,v) \in \operatorname{Graph}F$ close to $(\bar x,\bar y)$, and setting $r = c^{-1}(1 - \lambda)$ we get $rB_Y \subset DF(x,v)(B_X)$ for the same $(x,v)$. 

It follows from the proofs that the estimate provided by Theorem 5.13 is never worse than the estimates given by the other two theorems. But it can actually be strictly better (unless both spaces are finite dimensional). Informally, this is easy to understand: the quality of approximation provided by the contingent derivative for a map into an infinite dimensional space may be much lower than for a real-valued function. The following example illustrates the phenomenon. 

Example 5.17. Let $X = Y$ be a separable Hilbert space, and let $(e_1,e_2,\ldots)$ be an orthonormal basis in $X$. Consider the following mapping from $[0,1]$ into $X$: 

\[ \eta(t) = \begin{cases} 0, & \text{if } t \in \{0,1\}; \\ 2^{-(n+2)}e_n, & \text{if } t = 2^{-n}, \end{cases} \]

and $\eta(\cdot)$ is linear on every segment $[2^{-(n+1)},2^{-n}]$, $n = 0,1,\ldots$. Define a mapping from the unit ball of $\ell_2$ into $\ell_2$ by 

\[ F(x) = x - \eta(\|x\|). \]

It is an easy matter to see that $x \mapsto \eta(\|x\|)$ is $(\sqrt{5}/4)$-Lipschitz, hence by Milyutin’s perturbation theorem $F$ is open near the origin with the rate of surjection at least $1 - (\sqrt{5}/4)$. 

Let us see what we get applying both statements of the theorem to the mapping. If $\|h\| = 1$ and $t \in (2^{-(n+1)},2^{-n}]$, then $F(th) = th - (t/2)(e_n - \tfrac{1}{2}e_{n+1}) - 2^{-(n+2)}(e_{n+1} - e_n)$, and it is easy to see that for no sequence $(t_k)$ converging to zero does $t_k^{-1}F(t_k h)$ converge. Hence the tangent cone to the graph of $F$ at zero consists of a single point $(0,0)$ and the first statement gives $\operatorname{sur}F(0) \ge 0$ - a trivial conclusion. 

Now take an $x$ with $\|x\| < 1$ and a $y \neq F(x)$. We have 

\[ \begin{aligned} \|F(x + th) - y\| &= \|x + th - \eta(\|x + th\|) - y\| \\ &\le \|x + th - \eta(\|x\|) - y\| + \|\eta(\|x + th\|) - \eta(\|x\|)\| \\ &\le \|F(x) + th - y\| + (\sqrt{5}/4)t\|h\|. \end{aligned} \]

Taking $h = (y - F(x))/\|y - F(x)\|$, we get 

\[ \psi_y^-(x;h) \le \lim_{t \to +0} t^{-1}\Big(\Big(1 - \frac{t}{\|F(x) - y\|}\Big)\|F(x) - y\| - \|F(x) - y\|\Big) + \frac{\sqrt{5}}{4} = -\Big(1 - \frac{\sqrt{5}}{4}\Big), \]

which gives $\operatorname{sur}F(x) \ge 1 - (\sqrt{5}/4)$ for all $x$ with $\|x\| < 1$. 
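The Lipschitz constant $\sqrt{5}/4$ of $\eta$ can be verified directly: on the segment $[2^{-(n+1)},2^{-n}]$ with $n \ge 1$ the function moves from $2^{-(n+3)}e_{n+1}$ to $2^{-(n+2)}e_n$, so its derivative has norm $2^{-(n+3)}\sqrt{5}/2^{-(n+1)} = \sqrt{5}/4$. The sketch below checks this numerically in a finite-dimensional truncation (the truncation dimension `dim` and all function names are assumptions made for illustration).

```python
import math

def node(n, dim):
    """Value of eta at t = 2**-n: zero for n = 0, otherwise 2**-(n+2) * e_n."""
    v = [0.0] * dim
    if n >= 1:
        v[n - 1] = 2.0 ** -(n + 2)
    return v

def eta(t, dim=16):
    """Piecewise-linear eta(t) for 2**-dim <= t <= 1, as coordinates w.r.t. e_1..e_dim."""
    if t <= 0.0 or t >= 1.0:
        return node(0, dim)
    _, e = math.frexp(t)           # t = m * 2**e with 0.5 <= m < 1
    n = -e                         # hence t lies in [2**-(n+1), 2**-n)
    a, b = 2.0 ** -(n + 1), 2.0 ** -n
    w = (t - a) / (b - a)          # interpolation weight between the two nodes
    lo, hi = node(n + 1, dim), node(n, dim)
    return [(1.0 - w) * p + w * q for p, q in zip(lo, hi)]

def segment_slope(n, dim=16):
    """Norm of the (constant) derivative of eta on the segment [2**-(n+1), 2**-n]."""
    a, b = 2.0 ** -(n + 1), 2.0 ** -n
    diff = [q - p for p, q in zip(eta(a, dim), eta(b, dim))]
    return math.sqrt(sum(d * d for d in diff)) / (b - a)
```

On the first segment $[1/2,1]$ the slope is only $1/4$, since $\eta(1) = 0$; on all other segments it equals $\sqrt{5}/4$.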

A tangential regularity estimate, similar to but somewhat weaker than that in Theorem 5.14, was first obtained by Aubin in [6] (see also [8]) under the same assumptions. The very estimate (5.6) was obtained in [M]. Theorem 5.15 was proved by Dontchev-Quincampoix-Zlateva in [53]. Theorem 5.13 seems to have been stated for the first time in [32]. Example 5.17 has also been borrowed from that paper. 

5.4 Dual regularity estimates. 

This is the part of the local regularity theory that attracted most attention in the 80s 
and 90s. The role of coderivatives was at the center of the studies. Further developments 
that followed the discovery of the role of slope, however, opened the gates for potentially stronger 
(and often easier to apply) results involving subdifferentials of the functions $\varphi_y$ and $\psi_y$. 

5.4.1 Neighborhood estimates 

There is a simple connection between slopes and norms of elements of subdifferentials. 

Proposition 5.18 (slopes and subdifferentials). Let $f$ be lsc, and let an open set $U$ have 
nonempty intersection with $\operatorname{dom} f$. Then for any subdifferential $\partial$ 

$$\inf_{x\in U} d(0,\partial f(x)) \le \inf_{x\in U} |\nabla f|(x).$$

On the other hand, $\|x^*\| \ge |\nabla f|(x)$ if $x^* \in \partial_F f(x)$. 

Combining this with Theorems 3.10 and 3.12 we get 

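Theorem 5.19 (subdifferential regularity estimate 1). Let $X$ and $Y$ be Banach spaces, 
let $F : X \rightrightarrows Y$ have a locally closed graph, and let $\partial$ be a subdifferential trusted on a class 
of Banach spaces containing both $X$ and $Y$. Then for any $(\bar x,\bar y) \in \operatorname{Graph} F$ and any $\xi > 0$ 

$$\operatorname{sur}F(\bar x|\bar y) \ge \liminf_{\substack{(x,v)\,\underset{\operatorname{Graph}F}{\to}\,(\bar x,\bar y) \\ y\to\bar y,\ y\ne v}} \inf\{\|x^*\| + \xi^{-1}\|y^*\| : (x^*,y^*) \in \partial\varphi_y(x,v)\} \tag{5.9}$$

and 

$$\operatorname{sur}F(\bar x|\bar y) \ge \liminf_{\substack{(x,y)\to(\bar x,\bar y) \\ y\notin F(x)}} d(0,\partial\overline{\psi}_y(x)). \tag{5.10}$$

Theorem 5.20 (subdifferential regularity estimate 2). Let $(\bar x,\bar y) \in \operatorname{Graph} F$. Assume 
that there are neighborhoods $U$ of $\bar x$ and $V$ of $\bar y$ such that for any $y \in V$ the function $\psi_y$ 
is lower semicontinuous and $\|x^*\| \ge r$ if $x^* \in \partial_H\psi_y(x)$ for all $x \in U$ and $y \in V$. Then 

$$\operatorname{sur}F(\bar x|\bar y) \ge r. \tag{5.11}$$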

The obvious inequality $\|x^*\| \ge -f^-(x;h)$ if $x^* \in \partial_H f(x)$ and $\|h\| = 1$ shows that the 
estimate provided by the last theorem cannot be worse than the estimate of Theorem 5.13. 
Our next purpose is to derive coderivative estimates for regularity rates. 

Theorem 5.21 (coderivative regularity estimate 1). Let $F : X \rightrightarrows Y$ be a set-valued 
mapping with locally closed graph containing $(\bar x,\bar y)$. Then 

$$\begin{aligned} \operatorname{sur}F(\bar x|\bar y) &\ge \lim_{\varepsilon\to0} \inf\{C^*(D_H^*F(x,y)) : y \in F(x),\ \|x-\bar x\| < \varepsilon,\ \|y-\bar y\| < \varepsilon\} \\ &= \lim_{\varepsilon\to0} \inf\{\|x^*\| : x^* \in D_H^*F(x,y)(y^*),\ \|y^*\| = 1,\ (x,y) \in (\operatorname{Graph}F)\cap B((\bar x,\bar y),\varepsilon)\}, \end{aligned}$$

or equivalently, 

$$\begin{aligned} \operatorname{reg}F(\bar x|\bar y) = \operatorname{lip}F^{-1}(\bar y|\bar x) &\le \lim_{\varepsilon\to0} \sup\{\|D_H^*F^{-1}(y,x)\|_+ : (x,y) \in (\operatorname{Graph}F)\cap B((\bar x,\bar y),\varepsilon)\} \\ &= \lim_{\varepsilon\to0} \sup\{\|y^*\| : x^* \in D_H^*F(x,y)(y^*),\ \|x^*\| = 1,\ (x,y) \in (\operatorname{Graph}F)\cap B((\bar x,\bar y),\varepsilon)\}. \end{aligned}$$

To furnish the proof we can either use any of the estimates of the preceding theorem or 
apply directly the slope-based results of Theorems 3.10 and 3.12 via Proposition 5.18. We choose the 
second option as it actually leads to a shorter proof. The first approach requires working 
with weak* neighborhoods to estimate the subdifferential of a sum of functions (which inevitably 
appears in the course of calculation), and this makes estimating norms of subgradients difficult 
(if possible at all). 

Proof. We only need to show that, given $(x,w) \in \operatorname{Graph}F$, for any neighborhoods $U \subset X$ 
and $V \subset Y$ of $x$ and $w$ 

$$\inf\{\|x^*\| : x^* \in D^*F(u,v)(y^*),\ (u,v) \in \operatorname{Graph}F \cap (U\times V),\ \|y^*\| = 1\} \le m$$

if $|\nabla_\xi\varphi_y|(x,w) < m$ for small $\xi$. Then the theorem is immediate from Theorem 3.10 in 
view of Proposition 5.18. 

So let $|\nabla_\xi\varphi_y|(x,w) < m$. Take an $m' < m$ but still greater than $|\nabla_\xi\varphi_y|(x,w)$ and set 

$$\begin{aligned} f(u,v) &= \varphi_y(u,v) + m'\max\{\|u-x\|, \xi\|v-w\|\} \\ &= \|v-y\| + i_{\operatorname{Graph}F}(u,v) + m'\max\{\|u-x\|, \xi\|v-w\|\}. \end{aligned}$$

Then $f$ attains a local minimum at $(x,w)$. 

We thus can apply Proposition 5.9: given a $\delta > 0$, there are $v_i$, $i = 0,1,2$, $u_i$, $i = 1,2$, 
with $(u_1,v_1) \in \operatorname{Graph}F$, and $v_0^* \in \partial\|\cdot\|(y - v_0)$, $(u_1^*,v_1^*) \in N(\operatorname{Graph}F,(u_1,v_1))$ and 
$(u_2^*,v_2^*)$ with $\|u_2^*\| + \xi^{-1}\|v_2^*\| \le m'$ such that 

$$\|v_i - w\| < \delta,\quad \|u_i - x\| < \delta,\quad \|u_1^* + u_2^*\| < \delta,\quad \|v_0^* + v_1^* + v_2^*\| < \delta.$$

Take $\delta < \|y - w\|$, $(1+2\delta)m' < m$ and $\xi$ so small that $\xi m' < \delta$. Then $y \ne v_0$, so that 
$\|v_0^*\| = 1$, $\|u_2^*\| \le m'$ and $\|v_2^*\| < \delta$. We thus have $\|u_1^*\| < m' + \delta < m$ and $|\|v_1^*\| - 1| < 2\delta$. 
It remains to set $y^* = v_1^*/\|v_1^*\|$, $x^* = u_1^*/\|v_1^*\|$ to complete the proof. □ 

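Theorem 5.22 (coderivative regularity estimate 2). If in addition to the assumptions of 
Theorem 5.21 both $X$ and $Y$ are Asplund spaces, then 

$$\begin{aligned} \operatorname{sur}F(\bar x|\bar y) &= \lim_{\varepsilon\to0} \inf\{C^*(D_F^*F(x,y)) : y \in F(x),\ \|x-\bar x\| < \varepsilon,\ \|y-\bar y\| < \varepsilon\} \\ &= \lim_{\varepsilon\to0} \inf\{\|x^*\| : x^* \in D_F^*F(x,y)(y^*),\ \|y^*\| = 1,\ (x,y) \in (\operatorname{Graph}F)\cap B((\bar x,\bar y),\varepsilon)\}, \end{aligned}$$

or equivalently, 

$$\begin{aligned} \operatorname{reg}F(\bar x|\bar y) = \operatorname{lip}F^{-1}(\bar y|\bar x) &= \lim_{\varepsilon\to0} \sup\{\|D_F^*F^{-1}(y,x)\|_+ : (x,y) \in (\operatorname{Graph}F)\cap B((\bar x,\bar y),\varepsilon)\} \\ &= \lim_{\varepsilon\to0} \sup\{\|y^*\| : x^* \in D_F^*F(x,y)(y^*),\ \|x^*\| = 1,\ (x,y) \in (\operatorname{Graph}F)\cap B((\bar x,\bar y),\varepsilon)\}. \end{aligned}$$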

Proof. If the spaces are Asplund, then the same arguments as in the proof of the preceding 
theorem lead to the same conclusion with $D_H^*$ replaced by $D_F^*$. So we have to show that 
the opposite inequality holds. This however is an elementary consequence of the definition. 
Indeed, fix a certain $(x,y) \in \operatorname{Graph}F$ close to $(\bar x,\bar y)$ and let 

$$m = \inf\{\|x^*\| : x^* \in D_F^*F(x,y)(y^*),\ \|y^*\| = 1\}.$$

If $\operatorname{sur}F(\bar x|\bar y) = 0$ or $D_F^*F(x,y)(y^*) = \emptyset$ (in which case $m = \infty$ by the general convention), 
the inequality is trivial. So we can take a positive $r < \operatorname{sur}F(\bar x|\bar y)$, in which case we 
may assume that $B(y,rt) \subset F(B(x,t))$ for small $t$ and $(x,y)$ close to $(\bar x,\bar y)$, and suppose that 
$m < \infty$. Take an $x^* \in D_F^*F(x,y)(y^*)$ with $\|y^*\| = 1$ and $\|x^*\| < m + \delta$ for some $\delta > 0$. 
Then $\langle x^*,h\rangle - \langle y^*,v\rangle \le o(\|h\| + \|v\|)$ whenever $(x+h,y+v) \in \operatorname{Graph}F$. Now take 
$v(t)$ with $\|v(t)\| = rt$ such that $\langle y^*,v(t)\rangle \le -(1-t^2)\|v(t)\|$ and an $h(t)$ with $\|h(t)\| \le t$ such 
that $(x + h(t), y + v(t)) \in \operatorname{Graph}F$. Then 

$$-t\|x^*\| + (1-t^2)rt \le \langle x^*,h(t)\rangle - \langle y^*,v(t)\rangle \le o(\|h(t)\| + \|v(t)\|) = o(t)$$

which implies that $r \le \|x^*\| < m + \delta$. As $\delta$ is arbitrary, $r \le m$ and the result follows. □ 

Remark 5.23. Note that the proof just given (that the inequality $\le$ holds) works in any 
space, not necessarily Asplund. In other words, the part of the theorem that incorporates 
essential properties of the space (that is, that the Fréchet subdifferential is trusted) is 
contained in Theorem 5.21. 

Comparing the last theorem with Example 5.17 we conclude that in Asplund spaces 
the coderivative estimate using the Fréchet coderivative can be strictly better than the tangential 
estimate provided by Theorem 5.15. What about the connection between the estimates from 
Theorems 5.15 and 5.21? 

Proposition 5.24 (DH-coderivative vs. tangential criterion). The regularity estimate 
involving the Dini-Hadamard coderivative (Theorem 5.21) is never worse than the tangential 
estimate provided by Theorem 5.15. 

Proof. Indeed, by definition $D_H^*F(x,y) = (DF(x,y))^*$ and we only need to recall that 
$C^*(D_H^*F(x,y)) \ge C(DF(x,y))$ for any $(x,y) \in \operatorname{Graph}F$ by Theorem 5.4. □ 

Theorem 5.21 was proved in [M] for subdifferentials satisfying somewhat stronger requirements 
than the Dini-Hadamard subdifferential. However a minor change in the proof 
allows one to extend it to all subdifferentials trusted on the given Banach space (see e.g. [88], [M] 
also for a proof), in particular to the DH-subdifferential on any Gâteaux smooth space. 
Likewise, Theorem 5.22 was proved in [114], in a somewhat different form and in terms of the 
$\varepsilon$-Fréchet subdifferential on Fréchet smooth spaces. And again, a minor change is needed 
to extend the proof to standard Fréchet subdifferentials. Theorem 5.22 as stated was 
proved in [133] (see also [130] for a proof), for all Asplund spaces, not necessarily separable. 
This extension can be viewed as a consequence of the Fréchet smooth spaces version 
of the theorem and the separable reduction theorem of Fabian-Zhivkov [63] (and actually 
was proved that way). Proposition 5.24 seems to have never been mentioned earlier. It 
sounds rather surprising for all its simplicity. It would be interesting to find an example 
with a Dini-Hadamard coderivative estimate strictly better than the tangential estimate 
(or to prove that the estimates are equal). It is still unclear whether strict inequality is 
possible. The general consideration (the dual object cannot contain more information than 
its original predecessor) suggests that this is rather unlikely. But no proof is available for 
the moment. It should be mentioned however that the tangential estimate is valid in all 
Banach spaces while the Dini-Hadamard coderivative makes sense basically in Gâteaux 
smooth spaces. 

5.4.2 Perfect regularity and linear perturbations 

The main inconvenience of the regularity criteria that have been just established, no matter 
primal or dual, comes from the necessity to scan an entire neighborhood of the point of 
interest. Below we define what can be viewed as an ideal situation. 

Definition 5.25. We shall say that $F$ is perfectly regular at $(\bar x,\bar y) \in \operatorname{Graph}F$ if 

$$\operatorname{sur}F(\bar x|\bar y) = C^*(D_G^*F(\bar x,\bar y)) = \min\{\|x^*\| : x^* \in D_G^*F(\bar x,\bar y)(y^*),\ \|y^*\| = 1\}. \tag{5.12}$$

Later we shall come across some classes of perfectly regular mappings and meanwhile 
consider an important class of additive linear perturbations of maps. 

Definition 5.26. Let a set-valued mapping $F : X \rightrightarrows Y$ and an $(\bar x,\bar y) \in \operatorname{Graph}F$ be given. 
The radius of regularity of $F$ at $(\bar x,\bar y)$ is the lower bound of norms of linear continuous 
operators $A : X \to Y$ such that $\operatorname{sur}(F+A)(\bar x|\bar y + A\bar x) = 0$. We shall denote it $\operatorname{rad}F(\bar x|\bar y)$. 

By Milyutin's theorem $\operatorname{sur}F(\bar x|\bar y) \le \operatorname{rad}F(\bar x|\bar y)$. It turns out that for perfectly regular 
mappings the equality holds. To show this we need the following proposition, not very 
difficult to prove. 

Proposition 5.27. Let $X$ and $Y$ be normed spaces, let $F : X \rightrightarrows Y$ be a set-valued mapping 
with closed graph, and let $A \in \mathcal{L}(X,Y)$. Assume that $F$ is regular at $(\bar x,\bar y) \in \operatorname{Graph}F$ and 
set $G = F + A$ (that is $G(x) = F(x) + Ax$). Then 

$$D_G^*G(\bar x|\bar y + A\bar x) = D_G^*F(\bar x,\bar y) + A^*.$$

Note that the equality is elementary in case of Dini-Hadamard or Fréchet subdifferentials. 

Theorem 5.28 (perfect regularity and radius formula). Assume that $X$ and $Y$ are Banach 
spaces, $F : X \rightrightarrows Y$, $(\bar x,\bar y) \in \operatorname{Graph}F$ and $F + A$ is perfectly regular at $(\bar x,\bar y + A\bar x)$ for 
any $A \in \mathcal{L}(X,Y)$ of rank 1. Then 

$$\operatorname{sur}F(\bar x|\bar y) = \operatorname{rad}F(\bar x|\bar y). \tag{5.13}$$

Moreover, for any $\varepsilon > 0$ there is a linear operator $A_\varepsilon$ of rank one such that 
$\|A_\varepsilon\| \le \operatorname{sur}F(\bar x|\bar y) + \varepsilon$ and $\operatorname{sur}(F + A_\varepsilon)(\bar x|\bar y + A_\varepsilon\bar x) = 0$. 

In the sequel we call (5.13) the radius formula. 

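Proof. Set $r = \operatorname{sur}F(\bar x|\bar y)$. The theorem is obviously valid if $r = 0$. So we assume that 
$r > 0$. Take an $\varepsilon > 0$ and find a $y_\varepsilon^*$ and an $x_\varepsilon^* \in D_G^*F(\bar x,\bar y)(y_\varepsilon^*)$ such that $\|y_\varepsilon^*\| = 1$, 
$\|x_\varepsilon^*\| \le (1+\varepsilon)r$. Let further $x_\varepsilon \in X$ and $y_\varepsilon \in Y$ satisfy 

$$\|x_\varepsilon\| = \|y_\varepsilon\| = 1,\quad \langle x_\varepsilon^*,x_\varepsilon\rangle \ge (1-\varepsilon)\|x_\varepsilon^*\|,\quad \langle y_\varepsilon^*,y_\varepsilon\rangle \ge (1-\varepsilon). \tag{5.14}$$

We use these four vectors to define an operator $A_\varepsilon : X \to Y$ as follows: 

$$A_\varepsilon x = -\frac{\langle x_\varepsilon^*,x\rangle}{\langle y_\varepsilon^*,y_\varepsilon\rangle}\,y_\varepsilon.$$

Then $\|A_\varepsilon\| \le \dfrac{1+\varepsilon}{1-\varepsilon}\,r$ and 

$$A_\varepsilon^*y^* = -\frac{\langle y^*,y_\varepsilon\rangle}{\langle y_\varepsilon^*,y_\varepsilon\rangle}\,x_\varepsilon^*.$$

In particular we see that $-x_\varepsilon^* = A_\varepsilon^*y_\varepsilon^*$. Combining this with Proposition 5.27 we get 
$0 = x_\varepsilon^* + A_\varepsilon^*y_\varepsilon^* \in D_G^*(F + A_\varepsilon)(\bar x,\bar y + A_\varepsilon\bar x)(y_\varepsilon^*)$ and therefore by the perfect regularity assumption, 
$\operatorname{sur}(F + A_\varepsilon)(\bar x|\bar y + A_\varepsilon\bar x) = 0$, that is $\operatorname{rad}F(\bar x|\bar y) \le \|A_\varepsilon\| \to r$ as $\varepsilon \to 0$. □ 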
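
For a linear map the construction of the proof can be carried out explicitly. The sketch below is an illustration under assumptions not made in the paper: $X = \mathbb{R}^3$ and $Y = \mathbb{R}^2$ carry Euclidean norms, $F(x) = Mx$ for a hypothetical matrix `M`, and the modulus of surjection is computed as $\min\{\|M^Ty^*\| : \|y^*\| = 1\}$ by a brute-force scan of the unit circle. With $y_\varepsilon = y^*$ the rank-one operator of the proof makes the adjoint of $M + A$ vanish at $y^*$, so $M + A$ is no longer onto, while $\|A\|$ equals the modulus.

```python
import math

M = [[2.0, 1.0, 0.0],
     [0.0, 1.0, 1.0]]  # hypothetical surjective linear map from R^3 onto R^2


def adjoint_apply(mat, y):
    """Return mat^T y."""
    return [sum(mat[i][j] * y[i] for i in range(len(mat))) for j in range(len(mat[0]))]


def norm(v):
    return math.sqrt(sum(c * c for c in v))


def surjection_modulus(mat, steps=20000):
    """Approximate min ||mat^T y*|| over Euclidean unit vectors y* in R^2 (half-circle scan)."""
    best_val, best_y = float("inf"), None
    for k in range(steps):
        theta = math.pi * k / steps
        y = [math.cos(theta), math.sin(theta)]
        val = norm(adjoint_apply(mat, y))
        if val < best_val:
            best_val, best_y = val, y
    return best_val, best_y


r, y_star = surjection_modulus(M)
x_star = adjoint_apply(M, y_star)  # x* = M^T y*, so ||x*|| = r

# Rank-one operator A x = -<x*, x> y_eps / <y*, y_eps> with y_eps = y* (so <y*, y_eps> = 1).
A = [[-y_star[i] * x_star[j] for j in range(3)] for i in range(2)]
perturbed = [[M[i][j] + A[i][j] for j in range(3)] for i in range(2)]

residual = norm(adjoint_apply(perturbed, y_star))  # (M + A)^T y* should vanish
norm_A = norm(y_star) * norm(x_star)               # operator norm of a rank-one matrix
print(r, norm_A, residual)
```

Here `r` is approximately $(\sqrt{13}-1)/2$, the smallest singular value of `M`, in line with the radius formula for this example.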


Let $S(y,A)$ be the set of solutions of the inclusion 

$$y \in F(x) + Ax, \tag{5.15}$$

where $A \in \mathcal{L}(X,Y)$. Let $\bar x$ be a nominal solution of (5.15) with $y = \bar y$, $A = \bar A$. The 
question we are going to consider concerns Lipschitz stability of $S$ with respect to small 
variations of both $y$ and $A$ around the nominal value $(\bar y,\bar A)$ and their effect on regularity 
rates. 

In other words, we are interested in finding $\operatorname{lip}S((\bar y,\bar A)|\bar x)$. By the equivalence theorem, 
this is the same as finding the modulus of surjection of the mapping $\Phi = S^{-1}$ at $(\bar x,(\bar y,\bar A))$. 
Clearly 

$$\Phi(x) = \{(y,A) \in Y\times\mathcal{L}(X,Y) : y \in F(x) + A(x)\}.$$

We shall consider $Y\times\mathcal{L}(X,Y)$ with the norm $\|(y,A)\| = \nu(\|y\|,\|A\|)$, where $\nu$ is a 
norm in $\mathbb{R}^2$. The dual norm is $\nu^*(\|y^*\|,\|\ell\|)$, where $\ell \in (\mathcal{L}(X,Y))^*$ and $\nu^*$ is the norm 
in $\mathbb{R}^2$ dual to $\nu$: $\nu^*(u) = \sup\{\alpha\xi + \beta\eta : \nu(\alpha,\beta) \le 1\}$ for $u = (\xi,\eta)$. As to the space dual to $\mathcal{L}(X,Y)$, 
we only need the simplest elements of the space, rank one tensors $y^*\otimes x$ whose action on 
$A \in \mathcal{L}(X,Y)$ is defined by $\langle y^*\otimes x, A\rangle = \langle A^*y^*,x\rangle$ and whose norm is $\|y^*\otimes x\| = \|y^*\|\,\|x\|$. 
The following theorem gives an answer to the question. 

Theorem 5.29 ([95]). Let $X$ and $Y$ be Banach spaces, and let $F : X \rightrightarrows Y$ be a set-valued 
mapping with closed graph. Let $(\bar x,\bar y) \in \operatorname{Graph}F$ and let $\bar A \in \mathcal{L}(X,Y)$ be given. Then 

$$\operatorname{lip}S((\bar y,\bar A)|\bar x) \le \nu^*(1,\|\bar x\|)\,\operatorname{reg}(F + \bar A)(\bar x|\bar y).$$

To prove the theorem we only need to show that 

$$\operatorname{sur}\Phi(\bar x|(\bar y,\bar A)) \ge \frac{1}{\nu^*(1,\|\bar x\|)}\,\operatorname{sur}(F + \bar A)(\bar x|\bar y). \tag{5.16}$$

So the proof (involving some calculation) can be obtained either from Theorem 4.5 or 
directly from the general regularity criterion of Theorem 3.1. 

The concepts of perfect regularity and radius of regularity were introduced respectively 
in [102] and [52]. Theorem 5.28 is a new result. A finite dimensional version of Theorem 
5.29 for a class of $F$ with convex graph was proved in [30]. We shall discuss the problems 
considered in this subsection in more detail for finite dimensional mappings later in 
Section 8. 

6 Finite dimensional theory. 

In this section we concentrate on characterizations of regularity, subregularity and transversality 
for set-valued mappings between finite dimensional spaces. There are several basic 
differences that make the finite dimensional case especially rich. The first is that the 
subdifferential calculus is much more efficient. In addition certain properties that differ in 
the general case turn out to be identical in $\mathbb{R}^n$. In particular, for a lower semicontinuous 
function the Dini-Hadamard subdifferential and the Fréchet subdifferential are identical. 
Therefore the usual notation used in the literature for this common subdifferential is $\hat\partial$ 
rather than $\partial_H$ or $\partial_F$. Likewise, as the limiting Fréchet and the G-subdifferentials are also 
equal, it is convenient to speak simply about the limiting subdifferential and denote it simply 
by $\partial$. 

The second circumstance to be mentioned is the abundance of some special classes 
of objects of practical importance and definite theoretical interest. It is enough to mention 
polyhedral and semi-algebraic sets and mappings (to be considered in the second part 
of the paper), semi-smooth functions, prox-regular functions and sets etc. We do not 
discuss some interesting and important subjects, e.g. Kummer's inverse function theorem 
and its applications (well presented in the literature: much on the subjects can be found 
in [55, 109]) or semismooth mappings (see e.g. [68]). 

6.1 Regularity. 

Theorem 6.1. A set-valued mapping $F : \mathbb{R}^n \rightrightarrows \mathbb{R}^m$ with locally closed graph is perfectly 
regular near any point of its graph. 

Proof. This is immediate from Theorem 5.22. □ 

Theorem 6.2. The radius formula holds at any point of the graph of a set-valued mapping 
$F : \mathbb{R}^n \rightrightarrows \mathbb{R}^m$ with locally closed graph. Moreover, the lower bound in the definition of 
the radius of regularity is attained at a linear operator $A : \mathbb{R}^n \to \mathbb{R}^m$ of rank one. 

Proof. This is immediate from Theorem 5.28. □ 

Theorem 6.3. Let $F : \mathbb{R}^n \rightrightarrows \mathbb{R}^m$ be a set-valued mapping with locally closed graph, and 
let $(\bar x,\bar y) \in \operatorname{Graph}F$. Then 

$$\operatorname{sur}F(\bar x|\bar y) = \lim_{\varepsilon\to0} \inf\{C(DF(x,y)) : (x,y) \in (\operatorname{Graph}F)\cap B((\bar x,\bar y),\varepsilon)\}. \tag{6.1}$$

Proof. In view of Theorem 5.15 it is enough to verify that $C(DF(x,y)) \ge r$ if $B(y,tr) \subset F(B(x,t))$ 
for all sufficiently small $t$ (of course for $(x,y) \in \operatorname{Graph}F$). So take a $v \in \mathbb{R}^m$ 
with $\|v\| \le r$ and let $h(t)$ be such that $\|h(t)\| \le 1$ and $y + tv \in F(x + th(t))$. If now $h$ 
is any limiting point of $h(t)$ as $t \to 0$, then $v \in DF(x,y)(h)$. This shows that 
$rB_{\mathbb{R}^m} \subset DF(x,y)(B_{\mathbb{R}^n})$. □ 

Similarly, inequality can be replaced by equality in the estimate of Lipschitz stability 
of solutions of the inclusion 

$$y \in F(x) + Ax \tag{6.2}$$

with both $y$ and $A$ viewed as perturbations (cf. Theorem 5.29). But first we have to do 
some preliminary work. As in 5.4.2 we denote by $S(y,A)$ the set of solutions of (6.2) and 
by $\Phi$ the inverse mapping 

$$\Phi(x) = \{(y,A) : y \in F(x) + Ax\}.$$

Lemma 6.4. For any $x \in X$, let $E(x) : Y\times\mathcal{L}(X,Y) \to Y$ be the linear operator defined 
by $E(x)(y,A) = y - Ax$. Then, under the assumptions of Theorem 5.29, 

$$\nu^*(1,\|x\|)\,C(D\Phi(x,(y,A))) \le C(D(F+A)(x,y)),$$

whenever $y \in F(x) + Ax$. 

Proof. By definition $(h,v,\Lambda) \in X\times Y\times\mathcal{L}(X,Y)$ belongs to $T(\operatorname{Graph}\Phi,(x,y,A))$ if there 
are sequences $(h_n) \to h$, $(v_n) \to v$, $(\Lambda_n) \to \Lambda$ and $(t_n) \to +0$ such that 

$$y + t_nv_n - (A + t_n\Lambda_n)(x + t_nh_n) \in F(x + t_nh_n)$$

or 

$$y + t_n(v_n - \Lambda_nx - t_n\Lambda_nh_n) \in (F + A)(x + t_nh_n).$$

As $t_n\|\Lambda_nh_n\| \to 0$, it follows that 

$$T(\operatorname{Graph}\Phi,(x,y,A)) = \{(h,v,\Lambda) : (h,v - \Lambda x) \in T(\operatorname{Graph}(F+A),(x,y))\}$$

which amounts to 

$$E(x)\circ D\Phi(x,(y,A)) = D(F+A)(x,y). \tag{6.3}$$

We have (Corollary 5.3) $C(E(x)) \cdot C(D\Phi(x,(y,A))) \le C(D(F+A)(x,y))$. On the other 
hand $E(x)^*(y^*) = (y^*, -y^*\otimes x)$ and therefore (Proposition 1.3) 

$$C(E(x)) = \inf_{\|y^*\|=1}\|E(x)^*y^*\| = \nu^*(1,\|x\|).$$

This completes the proof of the lemma. □ 

Theorem 6.5 (linear perturbations - finite dimensional case). Let $F : \mathbb{R}^n \rightrightarrows \mathbb{R}^m$ be 
a set-valued mapping with locally closed graph, and let $\bar y \in F(\bar x)$. We consider 
$\mathbb{R}^m\times\mathcal{L}(\mathbb{R}^n,\mathbb{R}^m)$ with the norm $\nu(\|y\|,\|A\|)$, where $\nu$ is a certain norm in $\mathbb{R}^2$. Then, given an 
$\bar A \in \mathcal{L}(\mathbb{R}^n,\mathbb{R}^m)$, we have 

$$\operatorname{lip}S((\bar y,\bar A)|\bar x) = \nu^*(1,\|\bar x\|)\,\operatorname{reg}(F + \bar A)(\bar x|\bar y).$$

Proof. Immediate from the lemma and Theorem 5.29. □ 
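
A one-dimensional sanity check of the formula of Theorem 6.5 (a toy example, not taken from the paper): let $X = Y = \mathbb{R}$, $F(x) = \{0\}$ for all $x$ and $\bar A = \bar a \ne 0$, so that $S(y,a) = \{y/a\}$ near the nominal point and $\operatorname{reg}(F + \bar A) = 1/|\bar a|$. If $\nu$ is taken to be the max-norm on $\mathbb{R}^2$, its dual is the $\ell_1$-norm and the formula predicts the modulus $(1 + |\bar x|)/|\bar a|$. The Python sketch below estimates the modulus by finite differences at the nominal point, which is adequate here because $S$ is smooth; the function names, the step size and the number of directions are illustrative choices.

```python
import math


def solution(y, a):
    """Unique solution x of y = F(x) + a*x for the toy map F(x) = {0} on the real line."""
    return y / a


def nu_inf(dy, da):
    """Max-norm on the perturbation pair (y, A); its dual norm is the l1-norm."""
    return max(abs(dy), abs(da))


def lipschitz_estimate(y0, a0, step=1e-6, n_dirs=360):
    """Finite-difference estimate of lip S((y0, a0) | x0) with respect to nu_inf."""
    x0 = solution(y0, a0)
    best = 0.0
    for k in range(n_dirs):
        theta = 2.0 * math.pi * k / n_dirs
        dy, da = math.cos(theta), math.sin(theta)
        moved = solution(y0 + step * dy, a0 + step * da)
        best = max(best, abs(moved - x0) / (step * nu_inf(dy, da)))
    return best


y0, a0 = 2.0, 4.0
x0 = solution(y0, a0)
predicted = (1.0 + abs(x0)) / abs(a0)  # nu*(1, |x0|) * reg(F + a0), with reg = 1/|a0|
estimate = lipschitz_estimate(y0, a0)
print(predicted, estimate)
```

Replacing `nu_inf` by another norm on $\mathbb{R}^2$ changes the prediction to the corresponding dual norm of $(1,|\bar x|)$ divided by $|\bar a|$.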

Finally, we have to mention that a continuous single-valued mapping $f : \mathbb{R}^n \to \mathbb{R}^m$ 
can be strongly regular only if $m = n$. This is a simple consequence of Brouwer's invariance 
of domain theorem (see e.g. [M]). 

Theorem 6.1 was announced by Mordukhovich in a somewhat different form [128] 
(see also [129]). But the lower estimate for the modulus of surjection (which is actually 
the major step in the proof) is immediate from Ioffe [83]. Theorem 6.2 was proved by 
Dontchev-Lewis-Rockafellar in [52] and Theorem 6.3 by Dontchev-Quincampoix-Zlateva 
[53]. Theorem 6.5 is a slightly generalized version of the already mentioned result of Canovas, 
Gomez and Senent-Parra [30]. 

6.2 Subregularity and error bounds. 

Let $f$ be an extended-real-valued lsc function on $\mathbb{R}^n$. We can associate with this function 
the epigraphic map 

$$\operatorname{Epi}f(x) = \{\alpha \in \mathbb{R} : \alpha \ge f(x)\}.$$

Subregularity of such a mapping at a point $(\bar x,\alpha)$ (if $\alpha = f(\bar x)$ is finite) means that there 
is a $K > 0$ such that 

$$d(x,[f \le \alpha]) \le K(f(x) - \alpha)^+$$

for all $x$ close to $\bar x$. The constant $K$ in this case is usually called a local error bound for $f$ 
at $\bar x$. We shall say more about error bounds in the second part of the paper. 

To characterize the subregularity property of epigraphic maps we define the outer limiting 
subdifferential of $f$ at $\bar x$ as follows: 

$$\partial^> f(\bar x) = \Big\{\lim_{k\to\infty} x_k^* : \exists\, x_k \to_f \bar x,\ f(x_k) > f(\bar x),\ x_k^* \in \hat\partial f(x_k)\Big\}.$$

Theorem 6.6 (error bounds in $\mathbb{R}^n$). Let $f$ be a lower semicontinuous function on $\mathbb{R}^n$ 
that is finite at $\bar x$. Then $K > 0$ is a local error bound of $f$ at $\bar x$ if either of the following 
two equivalent conditions is satisfied: 

(a) $K \cdot \lim_{\varepsilon\to0} \inf\{|\nabla f|(x) : \|x - \bar x\| < \varepsilon,\ f(\bar x) < f(x) < f(\bar x) + K\varepsilon\} \ge 1$; 

(b) $K \cdot d(0,\partial^> f(\bar x)) \ge 1$. 

Thus, if $F : \mathbb{R}^n \rightrightarrows \mathbb{R}^m$ has locally closed graph and $(\bar x,\bar y) \in \operatorname{Graph}F$, then 

$$\operatorname{subreg}F(\bar x|\bar y) \le \big[\inf\{\|x^*\| : x^* \in \partial^> d(\bar y,F(\cdot))(\bar x)\}\big]^{-1}.$$

Proof. If (a) holds, then $K$ is a local error bound by Lemma 7.1 to be proved in the next 
section. To prove that (a) $\Rightarrow$ (b), let $x^* \in \partial^> f(\bar x)$. This means that there are sequences 
$(x_k)$ and $(x_k^*)$ such that $x_k \to_f \bar x$, $f(x_k) > f(\bar x)$, $x_k^* \to x^*$ and $x_k^* \in \hat\partial f(x_k)$. Choose $\varepsilon_k \to 0$ 
such that $\|x_k - \bar x\| < \varepsilon_k$ and $f(x_k) - f(\bar x) < K\varepsilon_k$. If (a) holds, then $K \cdot \liminf |\nabla f|(x_k) \ge 1$. 
But $\|x_k^*\| \ge |\nabla f|(x_k)$ (Proposition 5.18) and (b) follows. 

The opposite implication (b) $\Rightarrow$ (a) also follows from Proposition 5.18. Indeed, denote 
by $r$ the value of the limit in the left side of (a), take an $\varepsilon > 0$ and let $x$ satisfy the bracketed 
inequalities in (a) along with $|\nabla f|(x) < r + \varepsilon$. This means that $f + (r + \varepsilon)\|\cdot - x\|$ attains a local minimum at $x$. Applying 
the fuzzy variational principle, we shall find $u$ and $u^* \in \hat\partial f(u)$ such that $\|u - x\| < \varepsilon$, 
$f(u) < f(x) + \varepsilon/K$ and $\|u^*\| < r + 2\varepsilon$. This means that there is a sequence of pairs 
$(x_k,x_k^*)$ such that $x_k \to_f \bar x$, $x_k^* \in \hat\partial f(x_k)$ and $\limsup\|x_k^*\| \le r$. As (b) holds, it follows 
that $Kr \ge 1$. □ 


Conditions (a) and (b) are not necessary for $K$ to be an error bound of $f$ at $\bar x$. 

Example 6.7. Consider 

$$f(x) = \begin{cases} x, & \text{if } x \le 0; \\ x + x^2\sin x^{-1}, & \text{if } x > 0. \end{cases}$$

It is an easy matter to see that any $K > 1$ is an error bound for $f$ at zero but at the 
same time $0 \in \partial^> f(0)$. 
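
Both claims of the example can be probed numerically. The Python sketch below rests on an assumption about the garbled formula, namely that for $x > 0$ the function reads $f(x) = x + x^2\sin(1/x)$; the names `f` and `df` and the sampling grids are illustrative. At the points $x_k = 1/(2\pi k)$ the derivative vanishes while $f(x_k) > f(0) = 0$, which produces $0$ as an outer limiting subgradient, and for small $x > 0$ the nearest point of $[f \le 0]$ is the origin, so the error bound ratio is $x/f(x)$.

```python
import math


def f(x):
    """Branch x > 0 of the example, assuming f(x) = x + x^2 sin(1/x)."""
    return x + x * x * math.sin(1.0 / x)


def df(x):
    """Derivative of that branch for x > 0."""
    return 1.0 + 2.0 * x * math.sin(1.0 / x) - math.cos(1.0 / x)


# Points tending to zero from the right where the derivative vanishes.
critical_points = [1.0 / (2.0 * math.pi * k) for k in range(1, 200)]
max_abs_derivative = max(abs(df(x)) for x in critical_points)

# Ratio d(x, [f <= 0]) / f(x) = x / f(x) on a grid in (0, 0.01].
worst_ratio = max(x / f(x) for x in (i * 1e-5 for i in range(1, 1001)))
print(max_abs_derivative, worst_ratio)
```

The ratio stays below $1/(1 - 0.01)$ on this grid and tends to $1$ as the grid shrinks to zero, consistent with every $K > 1$ being an error bound.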

Such a pathological situation, however, does not occur if the function is "not too 
nonconvex" near $\bar x$. 

Proposition 6.8. Let $f$ be a lower semicontinuous function on $\mathbb{R}^n$ finite at $\bar x$. Suppose 
there are a $\theta > 0$ and a function $r(t) = o(t)$ such that 

$$f(u) - f(x) \ge \langle x^*,u - x\rangle - r(\|u - x\|)$$

for all $x$, $u$ of a neighborhood of $\bar x$, provided $f(\bar x) < f(x) < f(\bar x) + \theta$ and $x^* \in \hat\partial f(x)$. If 
under these conditions $K > 0$ is an error bound of $f$ at $\bar x$, then the conditions (a) and 
(b) of Theorem 6.6 hold. 

Proof. Assume the contrary. Then there are $\varepsilon > 0$ and a sequence of pairs $(x_k,x_k^*)$ with 
$x_k^* \in \hat\partial f(x_k)$ such that $x_k \to_f \bar x$, $f(x_k) > f(\bar x)$ and $\|x_k^*\| \le K^{-1} - \varepsilon$. For any $k$ take an 
$\bar x_k \in [f \le f(\bar x)]$ closest to $x_k$. Then $\bar x_k \to \bar x$ and by the assumption 

$$f(\bar x_k) - f(x_k) \ge \langle x_k^*,\bar x_k - x_k\rangle - r(\|\bar x_k - x_k\|).$$

As $\|\bar x_k - x_k\| \to 0$, for large $k$ we have $r(\|\bar x_k - x_k\|) \le (\varepsilon/2)\|\bar x_k - x_k\|$. For such $k$ 

$$f(x_k) \le f(\bar x_k) + (\|x_k^*\| + (\varepsilon/2))\|\bar x_k - x_k\|.$$

As $f(\bar x_k) \le f(\bar x)$, it follows that 

$$d(x_k,[f \le f(\bar x)]) = \|x_k - \bar x_k\| \ge \frac{f(x_k) - f(\bar x)}{\|x_k^*\| + (\varepsilon/2)},$$

that is $(K^{-1} - (\varepsilon/2))\,d(x_k,[f \le f(\bar x)]) \ge f(x_k) - f(\bar x)$, contrary to the assumption. □ 

The last result of this subsection contains an infinitesimal characterization of strong 
subregularity. 

Theorem 6.9 (characterization of subregularity and strong subregularity). Let again 
$F : \mathbb{R}^n \rightrightarrows \mathbb{R}^m$ have locally closed graph and $(\bar x,\bar y) \in \operatorname{Graph}F$. Then 

• $F$ is subregular at $(\bar x,\bar y) \in \operatorname{Graph}F$ if $d(0,\partial^>\psi_{\bar y}(\bar x)) > 0$; 

• a necessary and sufficient condition for $F$ to be strongly subregular at $(\bar x,\bar y)$ is that 
$DF(\bar x,\bar y)$ is nonsingular, that is $C^*(DF(\bar x,\bar y)) > 0$. 

Proof. The first statement is a consequence of Theorem 6.6. To prove the second, assume 
first that $F$ is strongly subregular at $(\bar x,\bar y)$, that is there is a $K > 0$ such that 
$\|x - \bar x\| \le Kd(\bar y,F(x))$ for $x$ sufficiently close to $\bar x$. If $DF(\bar x,\bar y)$ were singular, Proposition 5.2 would 
guarantee the existence of sequences $(h_k) \subset \mathbb{R}^n$, $(v_k) \subset \mathbb{R}^m$ and $t_k \to +0$ such that $\|h_k\| = 1$, 
$\|v_k\| \to 0$ and $\bar y + t_kv_k \in F(\bar x + t_kh_k)$, so that for large $k$ 

$$\|\bar x + t_kh_k - \bar x\| = t_k > Kt_k\|v_k\| = K\|\bar y + t_kv_k - \bar y\| \ge Kd(\bar y,F(\bar x + t_kh_k)),$$

contrary to our assumption. 

Let now $DF(\bar x,\bar y)$ be nonsingular. This means that $\|v\| \ge \kappa > 0$ whenever $v \in 
DF(\bar x,\bar y)(h)$ with $\|h\| = 1$. It immediately follows that, say, $\|y - \bar y\| \ge (\kappa/2)\|x - \bar x\|$ 
whenever $y \in F(x)$ and $x$ is sufficiently close to $\bar x$, which is strong subregularity of $F$ at 
$(\bar x,\bar y)$. □ 

Literature on local error bounds in $\mathbb{R}^n$ is very rich - see e.g. the monograph by 
Facchinei and Pang [65] that summarizes developments prior to 2003. Theorem 6.6 and 
Proposition 6.8 seem to be new as stated but they are closely connected with the results of 
Ioffe-Outrata [100] and Meng and Yang [127] among others. The second part of Theorem 
6.9 as well as other results relating to strong subregularity and applications can be found in 
[55] and [109]. (In [109] the authors use the term "locally upper Lipschitz" property. The 
term "strong subregularity" seems to have appeared later.) Another sufficient condition 
for subregularity was suggested by Gfrerer m- It would be interesting to understand 
how the two are connected. It should also be noted that no characterization for strong 
subregularity in terms of coderivatives is so far known. 

6.3 Transversality. 

We have mentioned already that the classical concepts of transversality and regularity 
are closely connected. To see how the concept of transversality can be interpreted in the 
context of variational analysis, we first consider the case of two intersecting manifolds in 
a Banach space. 

Let $X$ be a Banach space and $M_1$ and $M_2$ smooth manifolds in $X$, both containing 
some $x$. As was mentioned in Subsection 1.4, the manifolds are transversal at $x$ if either 
$x \notin M_1\cap M_2$ or the sum of the tangent subspaces to the manifolds at $x$ is the whole of $X$: 
$T_xM_1 + T_xM_2 = X$. The following simple lemma is the key to interpreting this in regularity 
terms in a way suitable for extensions to the settings of variational analysis. 

Lemma 6.10. Let $L_1$ and $L_2$ be closed subspaces of a Banach space $X$ such that $L_1 + L_2 = X$. 
Then for any $u,v \in X$ there is $h \in X$ such that $u + h \in L_1$ and $v + h \in L_2$. 

Proof. If $u + h \in L_1$, then $h \in -u + L_1$, so if the statement were wrong, we would have 
$(v - u + L_1)\cap L_2 = \emptyset$. In this case there is a nonzero $x^*$ separating $v - u + L_1$ and $L_2$, 
that is such that $\langle x^*,x\rangle = 0$ for all $x \in L_2$ and $\langle x^*,v - u + x\rangle \ge 0$ for all $x \in L_1$. But 
this means that $x^*$ vanishes on $L_1$ as well. In other words, both $L_1$ and $L_2$ belong to the 
annihilator of $x^*$ and so their sum cannot be the whole of $X$. □ 
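
In finite dimensions the vector $h$ of Lemma 6.10 can be computed directly. The following Python sketch treats the simplest case, two lines $L_1 = \operatorname{span}(a)$ and $L_2 = \operatorname{span}(b)$ in the plane with $a$, $b$ linearly independent; the function names and the sample data are illustrative. The idea is to decompose $u - v = sa + tb$ and take $h = sa - u$, so that $u + h = sa \in L_1$ and $v + h = -tb \in L_2$.

```python
def split(a, b, w):
    """Write w = s*a + t*b by Cramer's rule; requires a and b to be linearly independent."""
    det = a[0] * b[1] - a[1] * b[0]
    if det == 0:
        raise ValueError("span(a) + span(b) is not the whole plane")
    s = (w[0] * b[1] - w[1] * b[0]) / det
    t = (a[0] * w[1] - a[1] * w[0]) / det
    return s, t


def common_shift(a, b, u, v):
    """Return h with u + h in span(a) and v + h in span(b)."""
    # u + h = s*a and v + h = -t*b  <=>  u - v = s*a + t*b
    s, _ = split(a, b, (u[0] - v[0], u[1] - v[1]))
    return (s * a[0] - u[0], s * a[1] - u[1])


a, b = (1.0, 0.0), (1.0, 1.0)
u, v = (2.0, 3.0), (-1.0, 4.0)
h = common_shift(a, b, u, v)
shifted_u = (u[0] + h[0], u[1] + h[1])  # lies on span(a)
shifted_v = (v[0] + h[0], v[1] + h[1])  # lies on span(b)
print(h, shifted_u, shifted_v)
```

If `a` and `b` are parallel the decomposition fails, which corresponds to $L_1 + L_2 \ne X$ in the lemma.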


The lemma effectively says that the linear mapping $(u,v,h) \mapsto (u+h,v+h)$ maps 
$L_1\times L_2\times X$ onto $X\times X$, that is this mapping is regular. If $\bar x \in M_1\cap M_2$, then applying 
the density theorem (Theorem 3.5), we get as an immediate corollary that the set-valued 
mapping $\Phi(x) = (M_1 - x)\times(M_2 - x)$ from $X$ into $X\times X$ is regular at zero. This justifies 
the following definition. 

Definition 6.11. Let $S_i \subset X$, $i = 1,\ldots,k$ be closed subsets of $X$. We say that $S_i$ are 
transversal at $\bar x \in X$ if either $\bar x \notin \cap S_i$ or $\bar x \in \cap S_i$ and the set-valued mapping 

$$x \mapsto F(x) = (S_1 - x)\times\cdots\times(S_k - x)$$

from $X$ into $X^k$ is regular near $(\bar x,0,\ldots,0)$. In the latter case, we also say that $S_i$ have 
transversal intersection at $\bar x$. 


This definition may look strange at the first glance but the following characterization 
theorem shows that it is fairly natural. 

Theorem 6.12. Let $S_i \subset \mathbb{R}^n$, $i = 1,\ldots,k$ and $\bar x \in \cap S_i$. Then the following statements 
are equivalent: 

(a) $S_i$ are transversal at $\bar x$; 

(b) $x_i^* \in N(S_i,\bar x)$, $x_1^* + \cdots + x_k^* = 0$ $\Rightarrow$ $x_1^* = \ldots = x_k^* = 0$; 

(c) $d\big(x, \bigcap_{i=1}^k (S_i - x_i)\big) \le K\max_i d(x, S_i - x_i)$ if $x_i$ are close to zero and $x$ is close to $\bar x$. 

Proof. It is not a difficult matter to compute the limiting coderivative of $F$: if 
$(x_1,\ldots,x_k) \in F(x)$, then 

$$D^*F(x|(x_1,\ldots,x_k))(x_1^*,\ldots,x_k^*) = \begin{cases} \sum_{i=1}^k x_i^*, & \text{if } x_i^* \in N(S_i,x_i+x); \\ \emptyset, & \text{otherwise}. \end{cases}$$

Combining this with Theorem 6.1 we prove equivalence of (a) and (b). 

Furthermore, $F^{-1}(x_1,\ldots,x_k) = (S_1 - x_1)\cap\cdots\cap(S_k - x_k)$, whence equivalence of (a) 
and (c). □ 


Note that implicit in (c) is the statement that the intersection of $S_i - x_i$ is nonempty 
if $x_i$ are sufficiently small. In case of two sets one more convenient characterization of 
transversality is available. 


Corollary 6.13. Two sets $S_1$ and $S_2$ both containing $\bar x$ are transversal at $\bar x$ if and only 
if the set-valued mapping $\Phi : \mathbb{R}^n\times\mathbb{R}^n \rightrightarrows \mathbb{R}^n$: 

$$\Phi(x_1,x_2) = \begin{cases} x_1 - x_2, & \text{if } x_i \in S_i; \\ \emptyset, & \text{otherwise} \end{cases}$$

is regular near $(\bar x,\bar x,0)$. 

Proof. We have $T(\operatorname{Graph}\Phi,((x_1,x_2),x_1-x_2)) = \{(h_1,h_2,v) : h_i \in T(S_i,x_i),\ v = h_1 - h_2\}$, 
so that 

$$D^*\Phi((\bar x,\bar x),0)(x^*) = \{(x_1^*,x_2^*) : x_1^* \in N(S_1,\bar x) + x^*,\ x_2^* \in N(S_2,\bar x) - x^*\}.$$

If we consider the max-norm $\|(x_1,x_2)\| = \max\{\|x_1\|,\|x_2\|\}$ in $\mathbb{R}^n\times\mathbb{R}^n$, then it follows 
from Theorem 6.1 that $\Phi$ is regular near $(\bar x,\bar x,0)$ if and only if 

$$\inf\{\|x_1^* + x^*\| + \|x_2^* - x^*\| : x_i^* \in N(S_i,\bar x),\ \|x^*\| = 1\} > 0.$$

This amounts to $N(S_1,\bar x)\cap(-N(S_2,\bar x)) = \{0\}$, which is exactly the property in the part 
(b) of the theorem. □ 

In view of the equivalence between (a) and (c) in Theorem 6.12 the following definition 
now looks very natural. 

Definition 6.14 (subtransversality). We shall say that closed sets $S_1,\ldots,S_k$ are 
subtransversal at $\bar x \in \cap S_i$ if there is a $K > 0$ such that for any $x$ close to $\bar x$ 

$$d\Big(x,\bigcap_{i=1}^k S_i\Big) \le K\sum_{i=1}^k d(x,S_i).$$

In a similar way, it is easy to see that subtransversality is equivalent to subregularity of 
the same mapping $F$ and to get a sufficient subtransversality condition from Theorem 6.6. 
In the next section we shall be able to see the key role subtransversality plays in some 
problems of optimization and subdifferential calculus. 
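
The inequality of Definition 6.14 is easy to test on two planar examples. The Python sketch below is an illustration with sets chosen here, not taken from the paper: for the two coordinate axes the ratio $d(x,S_1\cap S_2)/(d(x,S_1) + d(x,S_2))$ never exceeds $1$, while for the $x$-axis and the parabola $\{y = x^2\}$, which are tangent at the origin, the ratio at $(t,0)$ is at least $1/t$ and so no constant $K$ works. Function names and grids are illustrative.

```python
import math


def axes_ratio(p):
    """d(p, S1 ∩ S2) / (d(p, S1) + d(p, S2)) for S1 = x-axis and S2 = y-axis."""
    dist_to_intersection = math.hypot(p[0], p[1])  # the intersection is the origin
    return dist_to_intersection / (abs(p[1]) + abs(p[0]))


def tangent_ratio_lower_bound(t):
    """Lower bound for the same ratio at p = (t, 0), S1 = x-axis, S2 = {y = x^2}, t > 0."""
    # d(p, S1) = 0 and d(p, S2) <= |(t, 0) - (t, t^2)| = t^2, while d(p, S1 ∩ S2) = t.
    return t / (t * t)


grid = [(i / 10.0, j / 10.0) for i in range(-5, 6) for j in range(-5, 6) if (i, j) != (0, 0)]
worst_axes_ratio = max(axes_ratio(p) for p in grid)
print(worst_axes_ratio, tangent_ratio_lower_bound(1e-3))
```

The first pair of sets satisfies condition (b) of Theorem 6.12, since the normal cones at the origin are the two different axes; for the tangent pair both normal cones coincide with the $y$-axis and (b) fails.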

We conclude with a brief discussion of transversality of a mapping and a set. 

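Theorem 6.15. Let $F : \mathbb{R}^n \rightrightarrows \mathbb{R}^m$ have locally closed graph, and let $S \subset \mathbb{R}^m$ be closed. 
Assume that $\bar y \in F(\bar x)\cap S$. Then the following statements are equivalent: 

(a) the set-valued mapping $\Phi : (x,y) \mapsto (F(x) - y)\times(S - y)$ is regular near $((\bar x,\bar y),(0,0))$; 

(b) the sets $\operatorname{Graph}F$ and $\mathbb{R}^n\times S$ have transversal intersection at $(\bar x,\bar y)$; 

(c) $0 \in D^*F(\bar x,\bar y)(y^*)$ & $y^* \in N(S,\bar y)$ $\Rightarrow$ $y^* = 0$. 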

Proof. Equivalence of (b) and (c) follows from Theorem 6.12. To prove that (a) and (b) are 
equivalent, set $\Psi(x,y) = (\operatorname{Graph}F - (x,y))\times(\mathbb{R}^n\times S - (x,y))$. If $((\xi,\mu),(\eta,\nu)) \in \Psi(x,y)$, 
then $(\mu,\nu) \in \Phi(u,y)$ with $u = \xi + x$. Conversely, if $(\mu,\nu) \in \Phi(u,y)$, then 
$(u,\mu+y) \in \operatorname{Graph}F$ and $(w,\nu+y) \in \mathbb{R}^n\times S$ for any $w \in \mathbb{R}^n$. Then for any $x$, we have, setting 
$\xi = u - x$, $\eta = w - x$, that $((\xi,\mu),(\eta,\nu)) \in \Psi(x,y)$. 

(b) $\Rightarrow$ (a). If (b) holds, then $\Psi$ is regular near $((\bar x,\bar y),((0,0),(0,0)))$. So let 
$((\xi,\mu),(\eta,\nu)) \in \Psi(x,y)$ with $(x,y)$ sufficiently close to $(\bar x,\bar y)$ and $\xi,\mu,\eta,\nu$ sufficiently close to zeros of the 
corresponding spaces. Take a small $t > 0$ and let $\|\xi' - \xi\| < t$ etc. Then by (b) there is a 
$K > 0$ and $(x',y')$ such that $\|x' - x\| < Kt$, $\|y' - y\| < Kt$ and $((\xi',\mu'),(\eta',\nu')) \in \Psi(x',y')$. 

We have 

$$\xi' = u' - x',\quad \mu' \in F(u') - y',\quad \eta' = w' - x',\quad \nu' \in S - y'$$

for some $(u',v') \in \operatorname{Graph}F$ and $w' \in \mathbb{R}^n$. We have therefore 
$\|u' - u\| \le \|x' - x\| + \|\xi' - \xi\| < (K+1)t$. 

Thus, whenever $(\mu,\nu) \in \Phi(u,y)$ with $(u,y)$ close to $(\bar x,\bar y)$ and $(\mu,\nu)$ close to $(0,0)$ and 
$t > 0$ is sufficiently small, for any $\mu',\nu' \in \mathbb{R}^m$ that differ from $\mu,\nu$ at most by $t$, there is a 
pair $(u',y')$ within $(K+1)t$ of $(u,y)$ such that $\mu' \in F(u') - y'$ and $\nu' \in S - y'$, that is (a). 

(a) $\Rightarrow$ (b). Here the arguments are similar, actually even a bit shorter. Let 
$((\xi,\mu),(\eta,\nu)) \in \Psi(x,y)$ with $(x,y)$ close to $(\bar x,\bar y)$ and $((\xi,\mu),(\eta,\nu))$ close to $((0,0),(0,0))$. Then as we have 
seen, $(\mu,\nu) \in \Phi(u,y)$ with $u = \xi + x$, also close to $\bar x$. Let further $\|\mu' - \mu\| < t$, $\|\nu' - \nu\| < t$. 
If $t$ is sufficiently small, then by (a) we can find $u'$, $y'$ such that $\|u' - u\| < Kt$, $\|y' - y\| < Kt$ 
with some positive $K$ such that $(\mu',\nu') \in \Phi(u',y')$. Take $x' = x$, $\xi' = u' - x$, $\eta' = \eta$. Then, as 
is immediate from what was explained in the first paragraph of the proof, 
$((\xi',\mu'),(\eta',\nu')) \in \Psi(x',y')$. Thus $\Psi$ is regular near $((\bar x,\bar y),((0,0),(0,0)))$. □ 

The proposition justifies the following definition. 

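Definition 6.16. Let $F : \mathbb{R}^n \rightrightarrows \mathbb{R}^m$ have locally closed graph, let $S \subset \mathbb{R}^m$ be a closed 
set, and let $(\bar x,\bar y) \in \operatorname{Graph}F$. We say that $F$ is transversal to $S$ at $(\bar x,\bar y)$ if either $\bar y \notin S$ 
or $\bar y \in S$ and $\operatorname{Graph}F$ and $\mathbb{R}^n\times S$ are transversal at $(\bar x,\bar y)$. We say that $F$ is transversal 
to $S$ if it is transversal to $S$ at any point of the graph. 

Likewise, if $\bar y \in F(\bar x)\cap S$, we shall say that $F$ is subtransversal to $S$ at $(\bar x,\bar y)$, provided 

$$d((x,y),\operatorname{Graph}F\cap(\mathbb{R}^n\times S)) \le K\big(d((x,y),\operatorname{Graph}F) + d(y,S)\big)$$

for $(x,y)$ of a neighborhood of $(\bar x,\bar y)$. 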

It is almost obvious from (a) that in case $\bar y \in F(\bar x)\cap S$, transversality of $F$ to $S$ at 
$(\bar x,\bar y)$ implies regularity of the mapping $x \mapsto F(x) - S$ near $(\bar x,0)$. Without going into 
technical details the explanation is as follows. Suppose we wish to find an $x$ such that 
$z \in F(x) - S$. By (a) there are some $(x,y)$ such that $(0,z) \in \operatorname{Graph}F - (x,y)$ and 
$(0,0) \in \mathbb{R}^n\times S - (x,y)$. This means that $z \in F(x) - y$, on the one hand, and $y \in S$, on 
the other hand, as required. 

The converse however does not seem to be valid, at least for a set-valued $F$. The 
situation here is similar to that considered in Example 4.7. However, the converse is 
also true in one important case. 

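Theorem 6.17. Assume that $F:\mathbb{R}^n\to\mathbb{R}^m$ is Lipschitz in a neighborhood of $\bar x$ and
$C\subset\mathbb{R}^n$, $Q\subset\mathbb{R}^m$ are nonempty and closed. Assume further that $\bar y=F(\bar x)\in Q$. Let
finally

\[ \Phi(x)=\begin{cases}F(x)-Q,&\text{if }x\in C;\\ \emptyset,&\text{otherwise},\end{cases}\qquad
F|_C(x)=\begin{cases}F(x),&\text{if }x\in C;\\ \emptyset,&\text{otherwise}.\end{cases} \]

Then $D^*\Phi(\bar x,0)(y^*)=\partial(y^*\circ F|_C)(\bar x)$ if $y^*\in N(Q,\bar y)$ and $D^*\Phi(\bar x,0)(y^*)=\emptyset$ otherwise.
Thus

\[ \operatorname{sur}\Phi(\bar x|0)=\min\{\|x^*\|:\ x^*\in\partial(y^*\circ F|_C)(\bar x),\ y^*\in N(Q,\bar y),\ \|y^*\|=1\}. \]

(Here of course $(y^*\circ F|_C)(x)=\infty$ if $x\notin C$.) If we compare this with Theorem 6.15 we
see that transversality of $F|_C$ to $Q$ at $\bar x$ is equivalent to regularity of $F|_C-Q$ near $(\bar x,0)$.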
We note also the following simple corollary of the theorem.

Corollary 6.18. Under the assumptions of the theorem,

\[ D^*\Phi(\bar x,0)(y^*)\subset\partial(y^*\circ F)(\bar x)+N(C,\bar x),\quad\text{if } y^*\in N(Q,\bar y). \]

The set-valued mapping in Definition 6.11 was introduced in [88], where it was shown
that subtransversality of a collection of sets is equivalent to subregularity of the mapping.
Theorem 6.12 was partly proved in [115] (equivalence of (a) and (c)) and partly in [122]
(equivalence of (a) and (b)). We refer to [115] for more equivalent descriptions (some
looking very technical) of transversality and related properties. The results relating to
transversality of set-valued mappings and sets in the image space seem to be new. The
exception is Theorem 6.17, which can be extracted from Theorem 5.23 of [130].


Part 2. Applications 
7 Special classes of mappings 

If additional information on the structure of a mapping is available, it is often possible 
to get stronger results and/or better estimates for regularity rates and to develop more 
convenient mechanisms to compute or estimate the latter. In this section we briefly discuss 
how this can be implemented for three important classes of mappings. 

7.1 Error bounds. 

By an error bound for $f$ (at level $\alpha$) on a set $U$ we mean any estimate for the distance to
$[f\le\alpha]$ in terms of $(f(x)-\alpha)^+$ for $x\in U$. We shall be mainly interested in estimates of
the form

\[ d(x,[f\le\alpha])\le K(f(x)-\alpha)^+ \tag{7.1} \]

(which sometimes are called linear or Lipschitz error bounds).
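
To make (7.1) concrete, the following minimal Python sketch compares two functions on the real line at level $\alpha=0$. The two functions, the sampling grid and the helper name `largest_ratio` are illustrative assumptions and are not taken from the text. For $f_1(x)=|x|-1$ the ratio $d(x,[f_1\le0])/f_1(x)^+$ is identically 1 outside the sublevel set, so (7.1) holds with $K=1$; for $f_2(x)=x^2$ the ratio behaves like $1/|x|$ near the origin, so no linear error bound can hold.

```python
def largest_ratio(dist, f, points):
    """Largest sampled value of d(x, [f <= 0]) / f(x) over points with f(x) > 0."""
    return max(dist(x) / f(x) for x in points if f(x) > 0)

# f1(x) = |x| - 1 has sublevel set [f1 <= 0] = [-1, 1].
f1 = lambda x: abs(x) - 1.0
d1 = lambda x: max(abs(x) - 1.0, 0.0)

# f2(x) = x^2 has sublevel set [f2 <= 0] = {0}.
f2 = lambda x: x * x
d2 = lambda x: abs(x)

# Hypothetical sampling grid on [-5, 5] with step 0.01.
grid = [k / 100.0 for k in range(-500, 501)]

k1 = largest_ratio(d1, f1, grid)  # sampled estimate of the best K for f1
k2 = largest_ratio(d2, f2, grid)  # keeps growing as the grid is refined near 0

print(k1, k2)
```

Refining the grid leaves `k1` unchanged, while `k2` grows roughly like the reciprocal of the grid step.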

As follows from the definition, error bounds can be viewed as rates of metric subregularity
of the set-valued mapping $\operatorname{Epi}f(x)=[f(x),\infty)=\{\alpha:\ (x,\alpha)\in\operatorname{epi}f\}$ from $X$ into $\mathbb{R}$.

Lemma 7.1 (Basic lemma on error bounds). Let $X$ be a complete metric space, let $U\subset X$
be an open set, and let $f$ be a lower semicontinuous function. Suppose that $|\nabla f|(x) > r > 0$
for any $x\in U\setminus[f\le0]$. Then for any $\bar x\in U$ such that $f(\bar x) < r\,d(\bar x,X\setminus U)$ there is a
$\bar u$ such that $f(\bar u)\le0$ and $d(\bar u,\bar x)\le r^{-1}(f(\bar x))^+$.

Proof. Without loss of generality, we may assume that $f$ is nonnegative: just take $f^+$
instead of $f$. So take an $\bar x$ as in the statement. By Ekeland's principle there is a $\bar u$ such that
$d(\bar u,\bar x)\le r^{-1}f(\bar x)$ (hence $\bar u\in U$ by the choice of $\bar x$) and $f(x)+r\,d(x,\bar u) > f(\bar u)$ if $x\ne\bar u$.
We claim that $f(\bar u)\le0$. Indeed, otherwise, by the assumption there would be an $x\ne\bar u$ such that
$f(\bar u)-f(x) > r\,d(x,\bar u)$, a contradiction. □

For simplicity we shall speak here mainly about global error bounds, corresponding to
$U=X$, at the zero level. We shall denote by $K_f$ the lower bound of $K$ such that (7.1)
holds for all $x$. We also set for brevity

\[ S=[f\le0],\qquad S_0=[f=0]. \]

7.1.1 Error bounds for convex functions. 

We shall start with the simplest case of a convex function $f$ (extended-real-valued in
general) on a Banach space $X$.

Theorem 7.2. Let $X$ be a Banach space and $f$ a proper closed convex function on $X$.
Assume that $S=[f\le0]\ne\emptyset$. Then

\[ K_f^{-1}=\inf_{x\in[f>0]}\ \sup_{\|h\|\le1}\big(-f'(x;h)\big)=\inf_{x\in[f>0]}d(0,\partial f(x))=\inf_{x\in[f>0]}\operatorname{sur}(\operatorname{Epi}f)(x,f(x)). \tag{7.2} \]

Here $\partial f(x)=\{x^*:\ f(x+h)-f(x)\ge\langle x^*,h\rangle\ \text{for all }h\}$ is the convex subdifferential.

Proof. Equality of the three quantities on the right is not connected with regularity and
we omit the proof. To prove the first equality, we observe that the inequality
$K_f^{-1}\le r=\inf_{x\in[f>0]}\sup_{\|h\|\le1}(-f'(x;h))$ is immediate from Basic Lemma because for a convex
function $|\nabla f|(x)=-\inf_{\|h\|\le1}f'(x;h)$. So it remains to prove the opposite inequality, for
which we can assume that $r > 0$.

Take a positive $r'$ and $\delta$ such that $\delta < r' < r$ and let $W(x)$ be the set of pairs $(u,t)$
satisfying

\[ \|u-x\|\le t,\qquad f(u)\le f(x)-r't. \tag{7.3} \]

By Ekeland's variational principle for any $\delta > 0$ there is a $(\bar u,\bar t)\in W(x)$ such that
$f(u)+\delta\|u-\bar u\|$ attains its minimum at $\bar u$. Clearly $\bar t > 0$ (as $f(x) > 0$). We claim that
$f(\bar u)=0$. Indeed, if $f(\bar u) > 0$, then there is an $h$ with $\|h\|=1$ such that $-f'(\bar u;h) > r'$,
that is $f(\bar u+th) < f(\bar u)-r't$ for some $t > 0$. Set $u=\bar u+th$. Then $f(u) < f(\bar u)-\delta\|u-\bar u\|$
and we get a contradiction with the definition of $\bar u$.

Thus $f(\bar u)=0$, which means that

\[ d(x,S_0)\le\|\bar u-x\|\le\bar t\le\frac{1}{r'}f(x), \]

and we are done as $r'$ can be chosen arbitrarily close to $r$ and $x$ is an arbitrary point of
$[f > 0]$. □
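
As a sanity check of (7.2) in the simplest setting, the sketch below takes the piecewise linear convex function $f(x)=\max\{2x-2,\,-x/2-1\}$ on $\mathbb{R}$. This function, the grid and the helper names are illustrative assumptions. Here $[f\le0]=[-2,1]$, and on $[f>0]$ the subdifferential is $\{2\}$ to the right of the interval and $\{-1/2\}$ to the left, so the middle expression in (7.2) equals $1/2$ and the theorem predicts $K_f=2$.

```python
def f(x):
    """Piecewise linear convex function with [f <= 0] = [-2, 1]."""
    return max(2.0 * x - 2.0, -0.5 * x - 1.0)

def dist_to_sublevel(x):
    """Distance from x to the interval [-2, 1]."""
    return max(x - 1.0, -2.0 - x, 0.0)

def subgradient_norm(x):
    """d(0, subdifferential of f at x) for x outside [-2, 1], where f is differentiable."""
    return 2.0 if x > 1.0 else 0.5

# Hypothetical sampling grid on [-10, 10] with step 0.1.
grid = [k / 10.0 for k in range(-100, 101)]
outside = [x for x in grid if f(x) > 0]

# Left-hand side of (7.2): best constant in the error bound, estimated on the grid.
k_sampled = max(dist_to_sublevel(x) / f(x) for x in outside)
# Middle expression of (7.2): infimum of d(0, df(x)) over [f > 0].
inf_subgrad = min(subgradient_norm(x) for x in outside)

print(k_sampled, 1.0 / inf_subgrad)
```

The worst ratio comes from the flatter left branch, in line with the fact that the error bound constant is governed by the smallest subgradient norm outside the sublevel set.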

There is another way to characterize $K_f$ in terms of normal cones to $[f\le0]$.

Theorem 7.3. For any continuous convex function $f$ on a Banach space $X$

\[ K_f=\inf_{x\in[f=0]}\ \inf\{r > 0:\ N([f=0],x)\cap B_{X^*}\subset[0,r]\,\partial f(x)\}. \tag{7.4} \]




7.1.2 Some general results on global error bounds. 

Let us turn now to the general case of an lsc function on a complete metric space.

Denote now by $K_f(\alpha,\beta)$ (where $\beta > \alpha\ge0$) the lower bound of $K$ such that

\[ d(x,[f\le\alpha])\le K(f(x)-\alpha)^+\quad\text{if }\alpha < f(x)\le\beta. \]

Clearly, $K_f=\lim_{\beta\to\infty}K_f(0,\beta)$.

Theorem 7.4. Let $X$ be a complete metric space and $f$ a lower semicontinuous function
on $X$. If $[f\le0]\ne\emptyset$, then

\[ \inf_{x\in[0 < f\le\beta]}|\nabla f|(x)=\inf_{\alpha\in[0,\beta)}K_f(\alpha,\beta)^{-1}. \]

Proof. Set $r=\inf_{x\in[0 < f\le\beta]}|\nabla f|(x)$. The inequality $K_f(\alpha,\beta)^{-1}\ge r$ for $0\le\alpha < \beta$ is
immediate from Lemma 7.1. This proves that the left side of the equality cannot be
greater than the quantity on the right. To prove the opposite inequality it is natural
to assume that $K_f(\alpha,\beta)^{-1} > \xi > 0$ for all $\alpha\in[0,\beta)$. For any $x\in[f > \alpha]$ and any
$\varepsilon > 0$ such that $f(x)-\varepsilon > \alpha$ choose a $u=u(\varepsilon)\in[f\le f(x)-\varepsilon]$ such that
$d(x,u)\le(1+\varepsilon)d(x,[f\le f(x)-\varepsilon])\le(1+\varepsilon)\xi^{-1}\varepsilon$, and therefore $u\to x$ as $\varepsilon\to0$. On the other hand,
$\xi\,d(x,u)\le(1+\varepsilon)(f(x)-f(u))$, which (as $u\to x$) implies that $\xi\le|\nabla f|(x)$, whence
$\inf_{\alpha\in[0,\beta)}K_f(\alpha,\beta)^{-1}\le|\nabla f|(x)$, and the result follows. □


As an immediate consequence we get

Corollary 7.5. Under the assumption of the theorem

\[ K_f^{-1}\ge\inf_{x\in[f>0]}|\nabla f|(x). \]

A trivial example of a function $f$ having an isolated local minimum at a certain $\bar x$ and
such that $\inf f < f(\bar x)$ shows that the inequality can be strict. This may happen of course
even if the slope is different from zero everywhere on $[f > 0]$. In this case an estimate of
another sort can be obtained. Set (for $\beta > 0$)

\[ d_f(\beta)=\sup_{x\in[f\le\beta]}d(x,[f\le0]) \]


and define the functions

\[ K_{f,\varepsilon}(t)=\sup\{|\nabla f|(x)^{-1}:\ |f(x)-t| < \varepsilon\};\qquad K_f(t)=\lim_{\varepsilon\to0}K_{f,\varepsilon}(t). \]

Proposition 7.6. Let $\beta > 0$. Assume that $[f\le0]\ne\emptyset$ and $|\nabla f|(x)\ge r > 0$ if
$x\in[0 < f\le\beta]$. Then

\[ d_f(\beta)\le\int_0^\beta K_f(t)\,dt. \]

Following the pioneering 1952 work by Hoffman [78] (to be proved later in this section),
error bounds, both for nonconvex and, especially, convex functions have been
intensively studied, especially during last 2-3 decades, both theoretically, in connection
with metric regularity, and also in view of their role in numerical analysis, see e.g.
[33], [65], [123], [134], [163], [178]. Basic lemma was proved in [88]; its earlier version
corresponding to $U=X$ was proved by Azé, Corvellec and Lucchetti. Finite
dimensional versions of Theorems 7.2 and 7.3 were proved by Lewis-Pang [123] and Klatte and
Li [111]. The equality $K_f^{-1}=\inf\{d(0,\partial f(x)):\ x\in[f > 0]\}$ in Theorem 7.2 was proved
by Zalinescu (see [177]). The first two equalities in the theorem can be found in [TJ] [T3]
and the third equality for polyhedral functions on $\mathbb{R}^n$ in [131]. Theorem 7.3 was proved
by Zheng and Ng [178] and Theorem 7.4 by Azé and Corvellec in [12]. The papers also
contain sufficiently thorough bibliographic comments. Here we follow [95] where proofs of
all stated and some other results can be found.

7.2 Mappings with convex graphs. 

7.2.1 Convex processes. 

We start with the simplest class of convex mappings known as convex processes. By
definition a convex process is a set-valued mapping $A:X\rightrightarrows Y$ from one Banach space
into another whose graph is a convex cone. A convex process is closed if its graph is a
closed convex cone. The closure $\operatorname{cl}A$ of a convex process $A$ is defined by
$\operatorname{Graph}(\operatorname{cl}A)=\operatorname{cl}(\operatorname{Graph}A)$. We shall usually work with closed convex processes. A convex process
is bounded if there is an $r > 0$ such that $\|y\|\le r\|x\|$ whenever $y\in A(x)$. A simplest
nontrivial example of an unbounded closed convex process is a densely defined closed
unbounded linear operator, as say the mapping $x(\cdot)\mapsto\dot x(\cdot)$ from $C[0,1]$ into itself which
associates with every continuously differentiable $x(\cdot)$ its derivative and the empty set with
any other element of $C[0,1]$.

According to Definition 5.1, given a convex process $A:X\rightrightarrows Y$, the adjoint process
$A^*:Y^*\rightrightarrows X^*$ (always closed) is defined by

\[ A^*(y^*)=\{x^*\in X^*:\ \langle x^*,x\rangle\le\langle y^*,y\rangle,\ \forall\,(x,y)\in\operatorname{Graph}A\}. \]

By $A^{**}$ we denote a convex process from $X$ into $Y$ whose graph is the intersection of
$-\operatorname{Graph}(A^*)^*$ with $X\times Y$, that is $A^{**}(x)=\{y:\ -y\in(A^*)^*(-x)\}$. Simple separation
arguments show that $A^{**}=\operatorname{cl}A$ for any convex process.

Proposition 7.7. Let $A:X\rightrightarrows Y$ be a convex process. Then $A(Q)$ is a convex set if so
is $Q$, and for any $x_1,x_2\in X$

\[ A(x_1)+A(x_2)\subset A(x_1+x_2). \]

Proposition 7.8. Let $K\subset X$ be a convex closed cone. Then for any $x\in K$ the tangent
cone $T(K,x)$ is the closure of the cone generated by $K-x$. In particular $K\subset T(K,x)$.

The propositions are the key element in the proof of the following fundamental property
of convex processes.

Theorem 7.9 (regularity moduli of a convex process). For any closed convex process
$A:X\rightrightarrows Y$ from one Banach space into another

\[ C(A)=C^*(A^*)=\operatorname{sur}A(0|0)=\operatorname{contr}A(0|0). \]

Note that the left equality is equivalent to $\|A^{-1}\|_-=\|(A^{-1})^*\|_+$ (cf. [25]).

Proof. We first observe that the right equality is a consequence of the other two in view
of Proposition 5.2. The inequality $C^*(A^*)\ge C(A)$ follows from Theorem 5.4. The same
theorem together with the definition of Banach constants implies that

C*{A**) > > C{A*) > C*{A*). 

But $A^{**}=A$, as $A$ is closed, so that $C^*(A^{**})=C^*(A)\le C(A)$ (see again Theorem 5.4).
This proves the left equality.

Passing to the proof of the middle equality, we first observe that by Proposition 5.2
$C(A)=\operatorname{contr}A(0|0)\ge\operatorname{sur}A(0|0)$, as the rate of surjection can never exceed the modulus
of controllability. On the other hand, by Proposition 7.8, $DA(0,0)(h)\subset DA(x,y)(h)$ for
all $(x,y)\in\operatorname{Graph}A$ and all $h$. Hence by Theorem 5.13 $\operatorname{sur}A(0|0)\ge C(DA(0,0))$. But
$DA(0,0)(h)=A(h)$, as the tangent cone to a closed convex cone at zero coincides with
the latter. Thus $\operatorname{sur}A(0|0)\ge C(A)$. □
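
For an invertible linear operator, the simplest closed convex process, the first equality of Theorem 7.9 can be checked numerically. The sketch below rests on several assumptions: the Euclidean norm on $\mathbb{R}^2$, the definitions $C(A)=\sup\{r:\ rB_Y\subset A(B_X)\}$ and $C^*(A^*)=\inf\{\|A^*y^*\|:\ \|y^*\|=1\}$ of the Banach constants (they are introduced outside this section and are recalled here as we understand them), and a hypothetical $2\times2$ matrix. For invertible $A$ the first quantity equals $1/\|A^{-1}\|$; both quantities are estimated by sampling the unit circle.

```python
import math

# Hypothetical invertible matrix standing in for the convex process A.
A = [[2.0, 1.0],
     [0.0, 0.5]]

def apply(M, v):
    """Matrix-vector product for a 2x2 matrix."""
    return (M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1])

def transpose(M):
    return [[M[0][0], M[1][0]], [M[0][1], M[1][1]]]

def inverse(M):
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det], [-M[1][0] / det, M[0][0] / det]]

def norm(v):
    return math.hypot(v[0], v[1])

N = 20000
circle = [(math.cos(2 * math.pi * k / N), math.sin(2 * math.pi * k / N)) for k in range(N)]

# C*(A*): infimum of ||A^T y|| over unit vectors y.
dual_constant = min(norm(apply(transpose(A), y)) for y in circle)
# C(A) = 1 / ||A^{-1}|| for invertible A: the largest r with r*B contained in A(B).
primal_constant = 1.0 / max(norm(apply(inverse(A), y)) for y in circle)

print(primal_constant, dual_constant)
```

Both numbers approximate the smallest singular value of the matrix, which is what the equality $C(A)=C^*(A^*)$ reduces to in this case.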

Corollary 7.10 (perfect regularity of convex processes). Any closed convex process is 
perfectly regular at the origin. 

Note that a convex process may be not perfectly regular outside of the origin. For
instance, consider in the space $C[0,1]$ the mapping into itself defined by $A(x(\cdot))=x(\cdot)+K$,
where $K$ is the cone of nonnegative functions.

We conclude this subsection by considering the effect of linear perturbations. If $A$ is
a convex process, then so is $A+T$ where $T$ is a linear bounded operator from $X$ into
$Y$. Thus if $A$ is closed, then $A+T$ is perfectly regular at the origin and we get as an
immediate consequence of Theorem 5.28

Theorem 7.11 (radius of regularity of a convex process). If $A:X\rightrightarrows Y$ is a closed convex
process, then

\[ \operatorname{rad}A(0|0)=\operatorname{sur}A(0|0). \]

Convex processes were introduced by Rockafellar [157], [158] as an extension of linear
operators and subsequently thoroughly studied by Robinson [148], Borwein [23], [24] and
Lewis [120], [121]. In particular, [148] contains an extension to convex processes of the
Banach-Schauder open mapping theorem. Another remarkable result (which is actually a special
case of Theorem 5 in the paper) can be reformulated as follows: let $X$ and $Y$ be Banach
spaces, and let $A:X\rightrightarrows Y$ and $T:X\rightrightarrows Y$ be closed convex processes. Then
$C(A-T)\ge C(A)-\|T\|_-$. The result equivalent to the equality $C(A)=C^*(A^*)$ (Theorem 7.9) was
proved and further discussed in [23], [24] and Theorem 7.11 in [120] along with the equality
of the radius and distance to infeasibility for convex processes.

7.2.2 Theorem of Robinson-Ursescu. 

Theorem 7.12 (surjection modulus of a convex map). Let $X$ and $Y$ be Banach spaces,
and let $F:X\rightrightarrows Y$ be a set-valued mapping with convex and locally closed graph. Suppose
there are $(\bar x,\bar y)\in\operatorname{Graph}F$, $\alpha > 0$ and $\beta > 0$ such that $F(B(\bar x,\alpha))$ is dense in $B(\bar y,\beta)$.
Then

\[ \operatorname{sur}F(\bar x|\bar y)\ge\frac{\beta}{\alpha}. \tag{7.5} \]

Proof. We can set $\bar x=0$, $\bar y=0$. It is clear that $F(t\alpha B_X)$ is dense in $t\beta B_Y$ for any
$t\in(0,1)$. Denote $r=\beta/\alpha$. We shall show that, given a $\gamma > 0$, there is an $\varepsilon > 0$ such that
$F(B(x,(1+\gamma)t))$ is dense in $B(v,rt)$ if $\|x\| < \varepsilon$, $\|v\| < \varepsilon$ and $v\in F(x)$. The theorem then
will follow from Corollary 3.8.

So take a small $\varepsilon > 0$, and let $\|x_0\| < \varepsilon$, $\|v_0\| < \varepsilon$ and $v_0\in F(x_0)$. Let further
$y\in B(v_0,rt)$ for some $t\in(0,\varepsilon)$. Consider the ray emanating from $v_0$ through $y$ and let
$y_1$ be the point of the ray with $\|y_1\|=\beta$, that is there is a $\lambda > 0$ such that

\[ y=\frac{1}{1+\lambda}\,y_1+\frac{\lambda}{1+\lambda}\,v_0. \]

We have $\|y_1-y\|=\lambda\|y-v_0\|$, that is

\[ \lambda=\frac{\|y_1-y\|}{\|v_0-y\|}\ge\frac{\beta-\varepsilon-rt}{rt},\qquad 1+\lambda\ge\frac{\beta-\varepsilon}{rt}. \]

In particular, if $\beta > (1+2r)\varepsilon$, which we may assume, then $\lambda > 1$.

Take a $\delta > 0$. By the assumption there is an $x_1\in\alpha B$ such that $\|y_1-v_1\| < \delta$ for some
$v_1\in F(x_1)$. Set

\[ v=\frac{1}{1+\lambda}\,v_1+\frac{\lambda}{1+\lambda}\,v_0,\qquad x=\frac{1}{1+\lambda}\,x_1+\frac{\lambda}{1+\lambda}\,x_0. \]

Then $v\in F(x)$ as $\operatorname{Graph}F$ is convex. We have $\|y-v\|\le\delta/(1+\lambda)\le\delta/2$ and

\[ \|x-x_0\|\le\frac{1}{1+\lambda}\,\|x_1-x_0\|\le\frac{\alpha+\varepsilon}{1+\lambda}\le\frac{\alpha+\varepsilon}{\beta-\varepsilon}\,rt. \]

If

\[ 1+\gamma > \frac{\alpha+\varepsilon}{\beta-\varepsilon}\cdot\frac{\beta}{\alpha}, \]

this completes the proof as $\delta$ can be chosen arbitrarily small. □


As a corollary we get 

Theorem 7.13 (Robinson-Ursescu [151], [167]). Let $X$ and $Y$ be Banach spaces. If the
graph of $F:X\rightrightarrows Y$ is convex and closed and $\bar y\in\operatorname{int}F(X)$, then $F$ is regular at any
$(\bar x,\bar y)\in\operatorname{Graph}F$.

Proof. Let $\bar y\in F(\bar x)$. We have to show that there are $\alpha > 0$ and $\beta > 0$ such that
$F(B(\bar x,\alpha))$ is dense in $B(\bar y,\beta)$, which is easy to do with the help of the standard argument
using Baire category. □

7.2.3 Mappings with convex graphs. Regularity rates.

Here we give two results containing exact formulas for the rate of surjection of set-valued
mappings with convex graph.

Theorem 7.14. Let $F:X\rightrightarrows Y$ be a set-valued mapping with convex and locally closed
graph. If $\bar y\in F(\bar x)$, then

\[ \operatorname{sur}F(\bar x|\bar y)=\lim_{\varepsilon\to0}\ \inf_{\|y^*\|=1,\ x^*}\Big(\|x^*\|+\frac{1}{\varepsilon}\,s_{\operatorname{Graph}F-(\bar x,\bar y)}(x^*,y^*)\Big). \]

The theorem was proved by Ioffe and Sekiguchi, see also [95] for a short proof.
It also allows one to get a "primal" representation for the rate of surjection of a convex
set-valued mapping. The key to this development is the concept of homogenization $\widehat Q$ of
a convex set $Q\subset X$, which is the closed convex cone in $X\times\mathbb{R}$ generated by the set
$Q\times\{1\}$. It is an easy matter to verify (if $Q$ is also closed) that $(x,t)\in\widehat Q$ if and
only if $x\in tQ$ if $t > 0$ and $x\in Q^\infty$, the recession cone of $Q$, if $t=0$. (Recall that
$Q^\infty=\{h\in X:\ x+h\in Q,\ \forall x\in Q\}$.)

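Given a set-valued mapping $F:X\rightrightarrows Y$ with convex closed graph, we associate
with $F$ and any $(\bar x,\bar y)\in X\times Y$ (not necessarily in the graph of $F$) a convex process
$\widehat F_{(\bar x,\bar y)}:X\times\mathbb{R}\rightrightarrows Y$ whose graph is the homogenization of $\operatorname{Graph}F-(\bar x,\bar y)$. It is easy to
see that

\[ \widehat F_{(\bar x,\bar y)}(h,t)=\begin{cases}t\big(F(\bar x+t^{-1}h)-\bar y\big),&\text{if }t > 0,\\ F^\infty(h),&\text{if }t=0,\\ \emptyset,&\text{if }t < 0,\end{cases} \]

where $F^\infty$ is the "horizon" mapping of $F$ whose graph is the recession cone of $\operatorname{Graph}F$:

\[ \operatorname{Graph}F^\infty=\{(h,v):\ (x+h,y+v)\in\operatorname{Graph}F,\ \forall(x,y)\in\operatorname{Graph}F\}. \]

If $(\bar x,\bar y)=(0,0)$, we shall simply write $\widehat F$ (without the subscript) and call this convex
process the homogenization of $F$.

In the theorem below we use the $\varepsilon$-norms in $X\times\mathbb{R}$: $\|(h,t)\|_\varepsilon=\max\{\|h\|,\varepsilon|t|\}$, and
denote by $C_\varepsilon(\widehat F_{(\bar x,\bar y)})$ the Banach constant of $\widehat F_{(\bar x,\bar y)}$ corresponding to this norm.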

Theorem 7.15 (primal representation of the surjection modulus). If $F:X\rightrightarrows Y$ is a
set-valued mapping with convex and locally closed graph, then

\[ \operatorname{sur}F(\bar x|\bar y)=\lim_{\varepsilon\to0}C_\varepsilon(\widehat F_{(\bar x,\bar y)}). \]

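Proof. We have (setting below $h=t(x-\bar x)$, $v=t(y-\bar y)$)

\[ \begin{aligned}
\operatorname{Graph}\widehat F^{\,*}_{(\bar x,\bar y)}&=\{(x^*,y^*,\lambda):\ \langle x^*,h\rangle-\langle y^*,v\rangle+\lambda t\le0,\ \forall\,(h,v,t)\in\operatorname{Graph}\widehat F_{(\bar x,\bar y)}\}\\
&=\{(x^*,y^*,\lambda):\ t\big[\langle x^*,x-\bar x\rangle-\langle y^*,y-\bar y\rangle+\lambda\big]\le0,\ \forall\,(x,y)\in\operatorname{Graph}F,\ t > 0\}\\
&=\{(x^*,y^*,\lambda):\ s_{\operatorname{Graph}F-(\bar x,\bar y)}(x^*,-y^*)+\lambda\le0\}.
\end{aligned} \]

As the support function of $\operatorname{Graph}F-(\bar x,\bar y)$ is nonnegative, it follows that $\lambda\le0$ whenever
$(x^*,y^*,\lambda)\in\operatorname{Graph}\widehat F^{\,*}_{(\bar x,\bar y)}$. The norm in $X^*\times\mathbb{R}$ dual to $\|\cdot\|_\varepsilon$ is $\|(x^*,\lambda)\|_\varepsilon=\|x^*\|+\varepsilon^{-1}|\lambda|$.

Let $d_\varepsilon$ stand for the distance in $X^*\times\mathbb{R}$ corresponding to this norm. Then

\[ \begin{aligned}
d_\varepsilon\big(0,\widehat F^{\,*}_{(\bar x,\bar y)}(y^*)\big)&=\inf\{\|x^*\|+\varepsilon^{-1}|\lambda|:\ s_{\operatorname{Graph}F-(\bar x,\bar y)}(x^*,-y^*)+\lambda\le0\}\\
&=\inf_{x^*}\big(\|x^*\|+\varepsilon^{-1}s_{\operatorname{Graph}F-(\bar x,\bar y)}(x^*,-y^*)\big).
\end{aligned} \]

It remains to compare this with Theorem 7.14 to see that

\[ \operatorname{sur}F(\bar x|\bar y)=\lim_{\varepsilon\to0}\ \inf_{\|y^*\|=1}d_\varepsilon\big(0,\widehat F^{\,*}_{(\bar x,\bar y)}(y^*)\big) \]

and then to refer to Theorem 7.9 to conclude that the quantity on the right is precisely the
limit as $\varepsilon\to0$ of $C_\varepsilon(\operatorname{cl}\widehat F_{(\bar x,\bar y)})$, where the closure operation can be dropped
because, as we mentioned, the norms (and therefore the Banach constants) of a convex
process and its closure coincide. □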

The concept of homogenization was introduced by Hörmander. The idea to apply
homogenization for regularity estimation goes back to Robinson's [150]. His main result
actually says that $\operatorname{sur}F(\bar x|\bar y)\ge C(\widehat F_{(\bar x,\bar y)})$. In a somewhat different context homogenization
techniques were applied by Lewis for estimating distance to infeasibility of so-called
conic systems. Full statement of Theorem 7.15 was proved also in [102]. We have not
discussed here some well developed problems relating to regularity of maps with convex
graphs, e.g. stability under perturbations of systems of convex inequalities - see e.g.
[29], [95], [149] and references in the first two quoted papers.

7.3 Single-valued Lipschitz maps. 

The collection of analytic tools that allow one to compute and estimate regularity moduli of
Lipschitz single-valued mappings contains at least two devices, not available in the general
situation, which are a lot more convenient to work with than coderivatives. The first is
the scalarized coderivative (associated with a subdifferential):

\[ \mathcal{D}^*F(x)(y^*)=\partial(y^*\circ F)(x), \]

and the other results from suitable local approximations of the mapping either by
homogeneous set-valued mappings or by sets of linear operators.

The following result is straightforward. 

Proposition 7.16. If $F:X\to Y$ is Lipschitz continuous near $x\in X$, then for every
$y^*\in Y^*$

\[ \partial_F(y^*\circ F)(x)=D_F^*F(x)(y^*). \tag{7.6} \]

Things are more complicated with the Dini-Hadamard subdifferential. From now on
we assume that all spaces are Gâteaux smooth.

Definition 7.17. A homogeneous set-valued mapping $\mathcal{H}:X\rightrightarrows Y$ is a strict Hadamard
prederivative of $F:X\to Y$ at $\bar x$ if $\|\mathcal{H}\|_+ < \infty$, and for any norm compact set $Q\subset X$

\[ F(x+th)-F(x)\subset t\mathcal{H}(h)+r(t,x)\,t\|h\|B_Y,\quad\forall\,h\in Q, \tag{7.7} \]

where $r(t,x)=r(t,x,Q)\to0$ when $x\to\bar x$, $t\to+0$. If moreover the inclusion holds with
$Q$ replaced by $B_X$ then $\mathcal{H}$ is called strict Fréchet prederivative of $F$ at $\bar x$. Clearly, for a
Fréchet prederivative we can write $r(t,x)$ in the form $\rho(t,\|x-\bar x\|)$.

There are some canonical ways for constructing prederivatives. The first to mention is
the generalized Jacobian introduced by Clarke [36] for mappings in the finite dimensional
case and then extended to some classes of Banach spaces by Páles and Zeidan [139], [140].
Another construction, not associated with linear operators, was introduced in [82]. Take
an $\varepsilon > 0$ and set

\[ \mathcal{H}_\varepsilon(h):=\{\lambda^{-1}(F(x+\lambda h)-F(x)):\ x,\,x+\lambda h\in\operatorname{dom}F\cap B(\bar x,\varepsilon),\ \lambda > 0\},\quad h\in X. \]

Then $0\in\mathcal{H}_\varepsilon(0)$ and for $t > 0$ we have

\[ \mathcal{H}_\varepsilon(th)=t\,\{(t\lambda)^{-1}(F(x+t\lambda h)-F(x)):\ x,\,x+t\lambda h\in\operatorname{dom}F\cap B(\bar x,\varepsilon),\ \lambda > 0\}, \]

that is $\mathcal{H}_\varepsilon(th)=t\mathcal{H}_\varepsilon(h)$. Thus $\mathcal{H}_\varepsilon$ is positively homogeneous and it is an easy matter to
see that (7.7) holds with $r(t,x)\equiv0$.

We say that $F:X\to Y$ is directionally compact at $\bar x\in\operatorname{dom}F$ if it has a (norm)
compact-valued strict Hadamard prederivative with closed graph. It is strongly directionally
compact if there is a compact-valued strict Fréchet prederivative with closed graph.

The simplest, and probably the most important example of a directionally compact
(actually even strongly directionally compact) mapping is an integral operator associated
with a differential equation, e.g.

\[ x(\cdot)\mapsto F(x(\cdot))(t)=x(t)-\int_0^t f(s,x(s))\,ds \]

with $f(t,\cdot)$ Lipschitz with summable rate.

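Proposition 7.18 ([86]). If $F:X\to Y$ is Lipschitz continuous near $\bar x$, then

\[ \partial_H(y^*\circ F)(\bar x)\subset D_H^*F(\bar x)(y^*),\quad\forall\,y^*\in Y^*. \]

If furthermore $F:X\to Y$ is directionally compact at $\bar x$, then

\[ D_H^*F(\bar x)(y^*)=\partial_H(y^*\circ F)(\bar x)\quad\text{and}\quad D_G^*F(\bar x)(y^*)=\partial_G(y^*\circ F)(\bar x),\quad\forall\,y^*\in Y^*. \]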

Combining this proposition with Theorem 5.21 we get

Theorem 7.19. Let $F:X\to Y$ satisfy the Lipschitz condition in a neighborhood of $\bar x$. If
$F$ is directionally compact at all $x$ of the neighborhood, then

\[ \operatorname{sur}F(\bar x)\ge\lim_{\varepsilon\to0}\ \inf\{\|x^*\|:\ x^*\in\partial_H(y^*\circ F)(x),\ \|y^*\|=1,\ \|x-\bar x\| < \varepsilon\}. \]

The obvious inequality

\[ (y^*\circ F)(x+h)-(y^*\circ F)(x)\ge\inf_{w\in\mathcal{H}(x)(h)}\langle y^*,w\rangle \]

(where $\mathcal{H}(x)$ is a strict prederivative at $x$) leads to the estimate
$\operatorname{sur}F(\bar x)\ge\liminf_{x\to\bar x}C^*(\mathcal{H}(x))$
under the assumptions of the theorem. A better result can be proved with the help of the
general metric regularity criteria if $F$ has a strict Fréchet prederivative at $\bar x$.

Theorem 7.20. Assume that $Y$ is Gâteaux smooth and $F:X\to Y$ satisfies the Lipschitz
condition in a neighborhood of $\bar x$ and, moreover, admits at $\bar x$ a strict Fréchet prederivative
$\mathcal{H}$ with norm compact values such that for any $y^*$ with $\|y^*\|=1$

\[ \sup_{\|h\|=1}\ \inf_{w\in\mathcal{H}(h)}\langle y^*,w\rangle\ge\rho > 0. \tag{7.8} \]

Then $\operatorname{sur}F(\bar x)\ge\rho$.

Proof. With no loss of generality we may assume that the norm in $Y$ is Gâteaux smooth
off the origin. Take an $\varepsilon\in(0,\rho/3)$ and an $r > 0$ such that

\[ F(x')-F(x)\in\mathcal{H}(x'-x)+\varepsilon\|x'-x\|B_Y, \tag{7.9} \]

if $x,x'\in B(\bar x,r)$. Take an $x\in B(\bar x,r/2)$ and a $y\in Y$ different from $F(x)$. Let $y^*$ denote
the derivative of $\|\cdot\|$ at $y-F(x)$. Then

\[ \lim_{t\to0}t^{-1}\big(\|y-F(x)+tw\|-\|y-F(x)\|\big)=\langle y^*,w\rangle\quad\text{for every } w\in Y. \tag{7.10} \]

By (7.8), there is an $h\in S_X$ such that

\[ \langle y^*,w\rangle\ge\rho-\varepsilon\quad\text{for all } w\in\mathcal{H}(h). \tag{7.11} \]

Since the set $-\mathcal{H}(h)$ is compact and the limit in (7.10) is uniform with respect to $w$ from
any fixed compact set, we conclude that for sufficiently small $t > 0$

\[ \|y-F(x)-tw\|-\|y-F(x)\|+\langle y^*,tw\rangle\le t\varepsilon\quad\text{for all } w\in\mathcal{H}(h). \]

This and (7.11) imply that

\[ \|y-F(x)-tw\|\le\|y-F(x)\|-\langle y^*,tw\rangle+\varepsilon t\le\|y-F(x)\|-t(\rho-2\varepsilon) \tag{7.12} \]

for all $w\in\mathcal{H}(h)$. Let $x':=x+th$. Then $\|x'-x\|=\|th\|=t < r/2$, hence $x'\in B(\bar x,r)$.
Since $\mathcal{H}$ is positively homogeneous, we have $\mathcal{H}(x'-x)=\mathcal{H}(th)=t\mathcal{H}(h)$. Thus by (7.9)
there is a $w\in\mathcal{H}(h)$ such that

\[ \|F(x')-F(x)-tw\|\le t\varepsilon. \tag{7.13} \]

Now, we are ready for the following chain of estimates

\[ \begin{aligned}
\|y-F(x')\|&\le\|F(x)-F(x')+tw\|+\|y-F(x)-tw\|\\
&\le\varepsilon t+\|y-F(x)\|-(\rho-2\varepsilon)t\qquad\text{(by (7.13) and (7.12))}\\
&=\|y-F(x)\|-(\rho-3\varepsilon)t=\|y-F(x)\|-(\rho-3\varepsilon)\|x'-x\|.
\end{aligned} \]

It remains to apply the criterion of Theorem 3.2. □


A slight modification of the proof allows one to get the following

Theorem 7.21. Assume that $F:X\to Y$ satisfies the Lipschitz condition in a neighborhood
of $\bar x$ and, moreover, there are a homogeneous set-valued mapping $\mathcal{H}:X\rightrightarrows Y$ with
norm compact values and $\beta > 0$ such that (7.8) holds and

\[ F(x+h)-F(x)\subset\mathcal{H}(h)+\beta\|h\|B_Y. \tag{7.14} \]

Then $\operatorname{sur}F(\bar x)\ge\rho-\beta$.

This theorem, in turn, allows us to look at what happens when a Lipschitz mapping is
approximated by a bunch of linear operators. Indeed, if $\mathcal{T}$ is a collection of linear operators
from $X$ to $Y$, then the set-valued mapping $X\ni x\mapsto\mathcal{H}(x):=\{Tx:\ T\in\mathcal{T}\}$ is of course
positively homogeneous. It is an easy matter to see that $\mathcal{H}$ inherits some properties of
$\mathcal{T}$: for us it is important to observe that when $\mathcal{T}$ is (relatively) norm compact in $\mathcal{L}(X,Y)$
with the norm $\|T\|=\sup\{\|Tx\|:\ \|x\|\le1\}$, then so are the values of $\mathcal{H}$; if $\mathcal{T}$ is bounded,
then the values of $\mathcal{H}$ are also bounded, etc. Thus we come to the following conclusion.

Theorem 7.22. Assume that for a given $\bar x\in\operatorname{dom}F$ there is a convex subset $\mathcal{T}\subset\mathcal{L}(X,Y)$
which is norm compact in $\mathcal{L}(X,Y)$ and has the following two properties:

(a) there is a $\beta > 0$ such that for any $x,x'$ in a neighborhood of $\bar x$ there is a $T\in\mathcal{T}$
such that

\[ \|F(x)-F(x')-T(x-x')\|\le\beta\|x-x'\|; \tag{7.15} \]

(b) there are $\rho > 0$ and $\varepsilon > 0$ such that for any $T\in\mathcal{T}$

\[ \varepsilon\rho B_Y\subset T(\varepsilon B_X). \tag{7.16} \]

Then $\operatorname{sur}F(\bar x)\ge\rho-\beta$.

Scalarization formulas first appeared in [83] for mappings between finite dimensional
spaces and m for mappings between Fréchet smooth spaces, although scalarized coderivatives
were considered already in [82], [112]. The very term "coderivative" was introduced
in [82]. The concept of prederivative was introduced in [82] and a characterization of
directional compactness in [SB], see also [TUB] for an earlier result.

Theorems 7.20 and 7.21 will appear in [32]. Theorem 7.22 was proved in [31]. An
earlier result without constraints on the domain of the mapping was proved by Páles in
[138]. We also refer to [32] for a shorter proof of the last theorem. Note that the convexity
requirement in Theorem 7.22 is essential (consider, for instance, $F(x)=|x|:\mathbb{R}\to\mathbb{R}$ and
$\mathcal{T}$ containing two operators $T_1(x)=x$ and $T_2(x)=-x$). Because of this requirement the
estimate provided by Theorem 7.22 is generally less precise than those of Theorems 7.19
and 7.20 (consider for instance the mapping $\mathbb{R}^2\to\mathbb{R}$: $F(x_1,x_2)=|x_1|-|x_2|$), but it
can be easier to apply in certain cases (e.g. in the finite dimensional case when we can
take the generalized Jacobian as $\mathcal{T}$ - see [36]).

7.4 Polyhedral sets and mappings 

This subsection contains some elementary results concerning geometry of polyhedral sets
in $\mathbb{R}^n$ and regularity of set-valued mappings with polyhedral graphs. Deeper problems
associated with variational inequalities over convex polyhedral sets will be discussed in
the next section.

Definition 7.23 (polyhedral sets). A convex polyhedral set (or a convex polyhedron) $Q\subset\mathbb{R}^n$
is an intersection of finitely many closed half-spaces and hyperplanes, that is

\[ Q=\{x\in\mathbb{R}^n:\ \langle x_i^*,x\rangle\le a_i,\ i=1,\dots,k;\ \langle x_i^*,x\rangle=a_i,\ i=k+1,\dots,m\} \tag{7.17} \]

for some nonzero $x_i^*\in\mathbb{R}^n$ and $a_i\in\mathbb{R}$. Following [SS] we shall use the term polyhedral set
for finite unions of convex polyhedra.

Clearly, any polyhedral set is closed. Also, as any linear equality can be replaced
by two linear inequalities, we can represent any polyhedral set by means of a system of
linear inequalities only. Elementary geometric arguments allow one to reveal one of the most
fundamental properties of polyhedral sets: the orthogonal projection of a polyhedral set is a
polyhedral set. In fact, a linear image of a polyhedral set is polyhedral (see [158] for this
and other basic properties of polyhedral sets).

A set-valued mapping $F:\mathbb{R}^n\rightrightarrows\mathbb{R}^m$ is (convex) polyhedral if so is its graph. Our primary
interest in this section is to study regularity properties of such mappings.

Proposition 7.24 (local tangential representation). Let $Q\subset\mathbb{R}^n$ be a polyhedral set and
$\bar x\in Q$. Then there is an $\varepsilon > 0$ such that

\[ Q\cap B(\bar x,\varepsilon)=\bar x+T(Q,\bar x)\cap(\varepsilon B). \]

As an immediate consequence, we conclude that regularity properties of a polyhedral
set-valued mapping with closed graph at a point of the graph are fully determined by the
corresponding properties at zero of its graphical derivative at the point.

One more useful corollary concerns normal cones of polyhedral sets.

Proposition 7.25. Let $Q\subset\mathbb{R}^n$ be a polyhedral set. Then for any $\bar x\in Q$ there is an
$\varepsilon > 0$ such that $N(Q,x)\subset N(Q,\bar x)$ for any $x\in Q\cap B(\bar x,\varepsilon)$.

Our first result is the famous Hoffman theorem on error bounds for a system of linear
inequalities. Set $a=(a_1,\dots,a_m)\in\mathbb{R}^m$ and let $Q(a)$ be defined by (7.17).

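Theorem 7.26 (Hoffman). Given $x_i^*\in\mathbb{R}^n$, $i=1,\dots,m$. Then there is a $K > 0$ such that the
inequality

\[ d(x,Q(a))\le K\Big(\sum_{i=1}^{k}(\langle x_i^*,x\rangle-a_i)^+ +\sum_{i=k+1}^{m}|\langle x_i^*,x\rangle-a_i|\Big) \]

holds for all $x\in\mathbb{R}^n$ and all $a\in\mathbb{R}^m$ such that $Q(a)\ne\emptyset$.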

Proof. We shall apply Theorem 7.2. Take an $a$ and set

\[ f(x)=\sum_{i=1}^{k}(\langle x_i^*,x\rangle-a_i)^+ +\sum_{i=k+1}^{m}|\langle x_i^*,x\rangle-a_i|. \]

Then $Q(a)=[f\le0]$. Set

\[ \begin{aligned}
I_1(x)&=\{i\in\{1,\dots,k\}:\ \langle x_i^*,x\rangle=a_i\},\\
I_0(x)&=\{i\in\{k+1,\dots,m\}:\ \langle x_i^*,x\rangle=a_i\},\\
J_+(x)&=\{i\in\{1,\dots,m\}:\ \langle x_i^*,x\rangle > a_i\},\\
J_-(x)&=\{i\in\{k+1,\dots,m\}:\ \langle x_i^*,x\rangle < a_i\}.
\end{aligned} \]

Then

\[ \partial f(x)=\sum_{i\in I_1(x)}[0,1]\,x_i^*+\sum_{i\in I_0(x)}[-1,1]\,x_i^*+\sum_{i\in J_+(x)}x_i^*-\sum_{i\in J_-(x)}x_i^*. \]

If $x\notin Q(a)$, then $0\notin\partial f(x)$ and $d(0,\partial f(x)) > 0$.

We observe now that the dependence of $\partial f(x)$ on $x$ and $a$ is fully determined by the
decomposition of the index set $\{1,\dots,m\}$. Let $\Sigma$ be the collection of all decompositions of the
index set into four subsets $I_1,I_0,J_+,J_-$ such that $I_1\subset\{1,\dots,k\}$, $I_0,J_-\subset\{k+1,\dots,m\}$
and

\[ 0\notin\sum_{i\in I_1}[0,1]\,x_i^*+\sum_{i\in I_0}[-1,1]\,x_i^*+\sum_{i\in J_+}x_i^*-\sum_{i\in J_-}x_i^*. \]

For any $\sigma\in\Sigma$ denote by $\gamma(\sigma)$ the distance from zero to the set in the right-hand side of
the above inclusion, and let $K$ stand for the upper bound of $\gamma(\sigma)^{-1}$ over $\sigma\in\Sigma$. Then
$K < \infty$ since $\Sigma$ is a finite set. Clearly, $K$ does not depend on either $a$ or $x$. On the other
hand, $K\,d(0,\partial f(x))\ge1$ if $x\notin Q(a)$. It remains to refer to Theorem 7.2 to conclude the proof. □
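
The fact that the Hoffman constant is governed by the vectors $x_i^*$ alone can be seen on a two-dimensional toy system. The sketch below is an illustration under assumptions of our own choosing: two inequalities $\langle n_1,x\rangle\le0$, $\langle n_2,x\rangle\le0$ with unit normals $n_{1,2}=(\sin\varphi,\pm\cos\varphi)$, the Euclidean distance, $\varphi=0.1$, and a closed-form projection onto the resulting wedge (the helper name `hoffman_ratio` is hypothetical). For $a=0$ both sides of the inequality are positively homogeneous, so it suffices to sample unit vectors. At $x=(1,0)$ the ratio of distance to residual equals $1/(2\sin\varphi)$, and for this $\varphi$ the sampled ratios do not exceed $1/\sin2\varphi$, so the constant deteriorates as the two normals become nearly opposite.

```python
import math

def hoffman_ratio(theta, phi):
    """d(x, Q) / residual(x) at the unit vector x with polar angle theta,
    where Q = {x : <n1, x> <= 0, <n2, x> <= 0}; returns None for feasible x."""
    x1, x2 = math.cos(theta), math.sin(theta)
    # Unit normals of the two inequalities.
    n1 = (math.sin(phi), math.cos(phi))
    n2 = (math.sin(phi), -math.cos(phi))
    r1 = max(n1[0] * x1 + n1[1] * x2, 0.0)
    r2 = max(n2[0] * x1 + n2[1] * x2, 0.0)
    residual = r1 + r2
    if residual < 1e-9:  # x is (numerically) feasible, nothing to compare
        return None
    # Q is symmetric about the x1-axis: reflect x into the upper half-plane
    # and project onto the upper boundary ray {t * u : t >= 0}.
    u = (-math.cos(phi), math.sin(phi))
    p1, p2 = x1, abs(x2)
    t = max(p1 * u[0] + p2 * u[1], 0.0)
    dist = math.hypot(p1 - t * u[0], p2 - t * u[1])
    return dist / residual

phi = 0.1   # hypothetical half-angle between each normal and the x2-axis
N = 3600
ratios = [hoffman_ratio(2 * math.pi * k / N, phi) for k in range(N)]
ratios = [r for r in ratios if r is not None]

largest = max(ratios)
at_axis = hoffman_ratio(0.0, phi)

print(at_axis, largest, 1.0 / math.sin(2 * phi))
```

Decreasing `phi` makes the wedge thinner and the printed values larger, while shifting the right-hand sides $a$ only translates the picture and leaves the ratios unchanged.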

As an immediate consequence, we get 

Theorem 7.27 (regularity of convex polyhedral mappings). Let $F:\mathbb{R}^n\rightrightarrows\mathbb{R}^m$ be a
polyhedral set-valued mapping. Then

(a) there is a $K > 0$ such that $d(y,F(\bar x))\le K\|x-\bar x\|$ for any $\bar x\in\operatorname{dom}F$ and any
$(x,y)\in\operatorname{Graph}F$;

(b) there is a $K > 0$ (different from that in (a)) such that $d(x,F^{-1}(y))\le K\,d(y,F(x))$
for any $x\in\operatorname{dom}F$ and $y\in F(X)$.

and

Theorem 7.28 (global subtransversality of convex polyhedral sets). Any two convex
polyhedral sets $Q_1$ and $Q_2$ with nonempty intersection are globally subtransversal: there is
a $K > 0$ such that

\[ d(x,Q_1\cap Q_2)\le K\big(d(x,Q_1)+d(x,Q_2)\big). \]

To prove Theorem 7.27 we have to apply the Hoffman estimate to the graph of $F$.
Concerning Theorem 7.28 it should be observed that global subtransversality does not imply
transversality at any point. As a simple example, consider the half spaces $S_1=\{x:\ \langle x^*,x\rangle\ge0\}$
and $S_2=\{x:\ \langle x^*,x\rangle\le0\}$ with some $x^*\ne0$. The intersection of the sets
is $\operatorname{Ker}x^*\ne\emptyset$. But the inclusions $x_1-x\in S_1$ and $x_2-x\in S_2$ imply $\langle x^*,x_1\rangle\ge\langle x^*,x_2\rangle$,
hence (see Definition 6.11) $S_1$ and $S_2$ are not transversal at points of $\operatorname{Ker}x^*$.

The results easily extend to all (not necessarily convex) polyhedral mappings.

Theorem 7.29 (subregularity of polyhedral mappings). Let $F:\mathbb{R}^n\rightrightarrows\mathbb{R}^m$ be a
semi-linear set-valued mapping with closed graph. Then

(a) there is a $K > 0$ such that for any $\bar x\in\operatorname{dom}F$ there is an $\varepsilon > 0$ such that
$d(y,F(\bar x))\le K\|x-\bar x\|$ for all $(x,y)\in\operatorname{Graph}F$ such that $\|x-\bar x\| < \varepsilon$;

(b) there is a $K > 0$ (different from that in (a)) such that for any $(\bar x,\bar y)\in\operatorname{Graph}F$
there is an $\varepsilon > 0$ such that $d(x,F^{-1}(\bar y))\le K\,d(\bar y,F(x))$ if $\|x-\bar x\| < K\varepsilon$. Thus $F$ is
subregular at any point of its graph.

Proof. We have $F(x)=\bigcup_i F_i(x)$ (a finite union), where all $F_i$ are convex polyhedral set-valued
mappings. By Theorem 7.27 for any $i$ there is a $K_i$ such that $d(y,F_i(\bar x))\le K_i\|x-\bar x\|$
for any $\bar x\in\operatorname{dom}F_i$ and any $(x,y)\in\operatorname{Graph}F_i$. Now fix some $\bar x\in\operatorname{dom}F$, and let
$I=\{i:\ \bar x\in\operatorname{dom}F_i\}$. Choose an $\varepsilon > 0$ so small that $d(x,\operatorname{dom}F_i) > \varepsilon$ if $i\notin I$
and $\|x-\bar x\| < \varepsilon$. (Clearly, such an $\varepsilon$ can be found as all $\operatorname{dom}F_i$ are polyhedral sets,
hence closed.) If now $y\in F(x)$ and $\|x-\bar x\| < \varepsilon$, then $I(x,y)=\{i:\ y\in F_i(x)\}\subset I$.
On the other hand, as we have seen, there are $K_i$ such that $y\in F_i(x)$ implies that
$d(y,F_i(\bar x))\le K_i\|x-\bar x\|$. Thus, if $y\in F(x)$ and $\|x-\bar x\| < \varepsilon$, then

\[ d(y,F(\bar x))\le\max_{i\in I(x,y)}d(y,F_i(\bar x))\le\big(\max_i K_i\big)\|x-\bar x\|. \]

This proves the first statement.

To prove the second, we apply the first to $F^{-1}$ and find $K$ and $\varepsilon$ such that
$d(x,F^{-1}(\bar y))\le K\|v-\bar y\|$ if $v\in F(x)$ and $\|v-\bar y\| < \varepsilon$. If $d(\bar y,F(x)) < \varepsilon$, it follows that
$d(x,F^{-1}(\bar y))\le K\,d(\bar y,F(x))$. This inequality trivially holds if $d(\bar y,F(x))\ge\varepsilon$ and $\|x-\bar x\| < K\varepsilon$. □

The property in the second part of the theorem falls short of metric regularity because
it does not guarantee that $\varepsilon$ will be uniformly bounded away from zero if we slightly
change $\bar y$. The following simple example illustrates the phenomenon.

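Example 7.30. Let $X = Y = \mathbb{R}$, and let

\[ F_1(x) = \begin{cases} \mathbb{R}_+, & \text{if } x > 0,\\ \mathbb{R}, & \text{if } x = 0,\\ \emptyset, & \text{if } x < 0; \end{cases}
\qquad
F_2(x) = \begin{cases} \mathbb{R}_-, & \text{if } x < 0,\\ \mathbb{R}, & \text{if } x = 0,\\ \emptyset, & \text{if } x > 0, \end{cases} \]

and $F(x) = F_1(x) \cup F_2(x)$. Fix some $y > 0$ and $x < 0$. Then $F^{-1}(y) = \mathbb{R}_+$ and
$d(x, F^{-1}(y)) = |x|$, $d(y, F(x)) = y$, so that for no $K$ the inequality $d(x, F^{-1}(y)) \le K d(y, F(x))$
holds in a neighborhood of $(0, 0)$.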
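The failure of a uniform estimate in Example 7.30 can be seen on concrete numbers. The sketch below is an illustration only: it encodes $F(x)$ as an interval (an assumption made for convenience, valid for this particular $F$) and evaluates the ratio $d(x, F^{-1}(y))/d(y, F(x))$ along the hypothetical path $x = -t$, $y = t^2$, which approaches $(0,0)$.

```python
import math

def F(x):
    """F(x) from Example 7.30, encoded as a closed interval (lo, hi)."""
    if x > 0:
        return (0.0, math.inf)        # R_+
    if x < 0:
        return (-math.inf, 0.0)       # R_-
    return (-math.inf, math.inf)      # all of R at x = 0

def dist_to_interval(t, interval):
    lo, hi = interval
    return max(lo - t, t - hi, 0.0)

def dist_to_preimage(x, y):
    """Distance from x to F^{-1}(y)."""
    if y > 0:
        return max(-x, 0.0)           # F^{-1}(y) = [0, inf)
    if y < 0:
        return max(x, 0.0)            # F^{-1}(y) = (-inf, 0]
    return 0.0                        # F^{-1}(0) = R

# ratio d(x, F^{-1}(y)) / d(y, F(x)) along x = -t, y = t^2
ratios = []
for t in (1e-1, 1e-2, 1e-3):
    x, y = -t, t * t
    ratios.append(dist_to_preimage(x, y) / dist_to_interval(y, F(x)))

print(ratios)  # grows like 1/t, so no single K works near (0, 0)
```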

Corollary 7.31 (subtransversality of polyhedral sets). Any two semi-linear sets $Q_1$ and
$Q_2$ with nonempty intersection are subtransversal at any common point of the sets: there is a $K > 0$ such that

\[ d(x, Q_1 \cap Q_2) \le K\big(d(x, Q_1) + d(x, Q_2)\big) \]

for all $x$ in a neighborhood of the point.

To conclude, we mention that for any polyhedral mapping $F : \mathbb{R}^n \rightrightarrows \mathbb{R}^m$ the set of
critical values (that is, those $y \in \mathbb{R}^m$ for which $\operatorname{sur} F(x|y) = 0$ for some $x \in F^{-1}(y)$)
is a polyhedral set of dimension smaller than $m$. This will immediately follow from the
semi-algebraic Sard theorem stated in the next subsection.

7.5 Semi-algebraic mappings, stratifications and the Sard theorem. 

Most of the results of this subsection, including the Sard theorem, can be extended to a
wide class of objects, the so-called definable sets, mappings and functions. We however confine
ourselves here to semi-algebraic functions, whose definition is much simpler (compare with
the general definition of definability) and does not require any specific effort.*

* It should be mentioned that recently Barbet, Dambrine, Daniilidis, Rifford [18] proved a remarkable
result containing extensions of the Sard theorem to some other important classes of non-smooth functions.

We shall concentrate basically on two topics: consequences of the general theory and
studies associated with semi-algebraic geometry, mainly in connection with the Sard theorem.

7.5.1 Basic properties (see mm)- 

A semi-algebraic set in $\mathbb{R}^n$ is by definition a union of finitely many sets of solutions of a
finite system of polynomial equalities and inequalities of $n$ variables:

\[ \{x \in \mathbb{R}^n : P_i(x) = 0,\ i = 1, \dots, k,\ \ P_i(x) < 0,\ i = k + 1, \dots, m\}. \]
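As a small illustration of the definition (the two sets below are standard examples chosen here for concreteness, not taken from the text), the closed unit disc in $\mathbb{R}^2$ and the graph of the absolute value function can each be written as a finite union of sets of the above form:

\[ \{x \in \mathbb{R}^2 : x_1^2 + x_2^2 \le 1\} = \{x : x_1^2 + x_2^2 - 1 = 0\} \cup \{x : x_1^2 + x_2^2 - 1 < 0\}, \]

\[ \operatorname{Graph} |\cdot| = \{(x, y) : y - x = 0,\ -x < 0\} \cup \{(x, y) : y + x = 0,\ x < 0\} \cup \{(x, y) : x = 0,\ y = 0\}. \]

In particular, the function $x \mapsto |x|$ is semi-algebraic although it is not a polynomial; non-strict inequalities are obtained by adding the corresponding equality pieces to the union.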

As immediately follows from the definition, every algebraic set is semi-algebraic, every 
polyhedral set is semi-algebraic, unions and intersections of finite collections of semi- 
algebraic sets are again semi-algebraic. The main fact of the semi-algebraic geometry is 
the deep Tarski-Seidenberg theorem which roughly speaking says that a linear projection 
of a semi-algebraic set is a semi-algebraic set. This theorem determines stability of the 
class of semi-algebraic sets with respect to a broad variety of transformations. 

A mapping (no matter single or set-valued) is semi-algebraic if its graph is semi- 
algebraic. Here is a list of some basic properties of semi-algebraic sets and mappings: 

• the closure and interior of a semi-algebraic set are semi-algebraic;

• the Cartesian product of semi-algebraic sets is semi-algebraic;

• the composition of semi-algebraic mappings is semi-algebraic;

• the image and preimage of a semi-algebraic set under a semi-algebraic mapping are
semi-algebraic;

• the derivative of a (single-valued) semi-algebraic mapping is semi-algebraic;

• the upper and lower bounds of a finite collection of extended-real-valued
semi-algebraic functions are semi-algebraic;

• if we have a semi-algebraic function of two (vector) variables, then its upper or lower
bound with respect to one of the variables on a semi-algebraic set is semi-algebraic;

• if $F$ is a semi-algebraic set-valued mapping such that every $F(x)$ is a finite set, then
the number of elements in each $F(x)$ does not exceed a certain finite $N$.

For us, in the context of variational analysis and, especially, regularity theory, the most
important facts are that

• the subdifferential mapping of a semi-algebraic function and the coderivative mapping of
a semi-algebraic map are semi-algebraic (no matter which subdifferential on $\mathbb{R}^n$ we are
talking about: Fréchet, Dini-Hadamard, limiting or Clarke);

• the slope of a semi-algebraic function is a semi-algebraic function of the point;

• the rates of regularity of a semi-algebraic mapping are also semi-algebraic functions of
the point of the graph.

Definition 7.32. A finite partition $(M_i)$ of a set $Q \subset \mathbb{R}^n$ is called a $C^r$-Whitney stratification
of $Q$ if each $M_i$ is a $C^r$-manifold and the following two properties are satisfied:

(a) if $(x_k) \subset M_i$ converges to some $x$ belonging to another element $M_j$ of the partition,
and the unit normal vectors $v_k \in N_{x_k}M_i$ converge to some $v$, then $v \in N_xM_j$;

(b) if $M_j \cap \operatorname{cl} M_i \ne \emptyset$, then $M_j \subset \operatorname{cl} M_i$.
Elements of partitions are usually called strata. The following remarkable fact is due to
S. Łojasiewicz:

Theorem 7.33 (stratification theorem). Let $Q \subset \mathbb{R}^n$ be a semi-algebraic set and let
$r \in \mathbb{N}$. Then $Q$ admits a Whitney stratification into semi-algebraic $C^r$-manifolds.

Of course, stratification is not unique. But it is easy to understand that maximal 
dimensions of the strata coincide for all Whitney stratifications. This observation justifies 
the following 

Definition 7.34. The dimension $\dim Q$ of a semi-algebraic set $Q$ is the maximal dimension
of the strata in Whitney stratifications of $Q$.

The most important consequence of the stratification theorem is a Sard-type theorem
for semi-algebraic set-valued mappings.

Definition 7.35. Let $F : \mathbb{R}^n \rightrightarrows \mathbb{R}^m$ be a set-valued mapping with semi-algebraic graph,
and let $\partial$ stand either for the limiting or for the Clarke subdifferential. A point $y \in \mathbb{R}^m$
is a critical value of $F$ if there is an $x \in \mathbb{R}^n$ such that $y \in F(x)$ and $0 \in D^*F(x|y)(y^*)$ for
some $y^* \ne 0$.

Theorem 7.36 (semi-algebraic Sard theorem). Critical values of a semi-algebraic
set-valued mapping $F : \mathbb{R}^n \rightrightarrows \mathbb{R}^m$ form a semi-algebraic set of dimension not exceeding
$m - 1$.

In particular, an extended-real-valued semi-algebraic function can have at most finitely
many critical values.
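For a smooth real-valued function the critical values are simply the values at points where the derivative vanishes, so the last statement can be illustrated on a polynomial. The sketch below is only an illustration under that assumption: the polynomial $f(x) = x^4 - 2x^2$, the search grid and the helper `bisect_root` are hypothetical choices, not part of the theory above.

```python
def f(x):
    return x**4 - 2 * x**2

def df(x):
    return 4 * x**3 - 4 * x

def bisect_root(g, a, b, tol=1e-12):
    """Locate a root of g in [a, b], assuming g(a) and g(b) have opposite signs."""
    ga = g(a)
    while b - a > tol:
        m = 0.5 * (a + b)
        gm = g(m)
        if ga * gm <= 0:
            b = m            # sign change (or root) in [a, m]
        else:
            a, ga = m, gm    # sign change in [m, b]
    return 0.5 * (a + b)

# scan a grid for sign changes of f' and refine each bracket
grid = [-2.05 + 0.3 * i for i in range(15)]
critical_points = [
    bisect_root(df, a, b)
    for a, b in zip(grid, grid[1:])
    if df(a) * df(b) < 0
]

# three critical points, but only two distinct critical values
critical_values = {round(f(x), 9) for x in critical_points}
print(sorted(critical_values))
```

Here the critical points are $-1$, $0$ and $1$, while the set of critical values is finite and even smaller, in agreement with the bound $m - 1 = 0$ on its dimension.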

For the theory of semi-algebraic sets and mappings see [21, 175]. The Sard theorem
was first proved by Bolte-Daniilidis-Lewis [22] for real-valued functions and then by Ioffe
[90] for set-valued mappings (in both cases the theorems were stated for more general
classes of objects: semi-analytic functions in [22] and arbitrary stratifiable maps in [90]).

7.5.2 Transversality.

We are finally ready to extend transversality theory (not just the definition) beyond the
smooth domain. To begin with, we observe that a direct extension of Proposition 1.12
does not hold if $F$ is not smooth.

Example 7.37. Consider the function

\[ f(x, w) = |x| - |w| \]

viewed as a mapping from $\mathbb{R}^2$ into $\mathbb{R}$. This mapping is clearly semi-algebraic, even
polyhedral. It is easy to verify that the mapping is regular at every point with the modulus
of surjection identically equal to one (if we take the $\ell^\infty$ norm in $\mathbb{R}^2$). Furthermore,

\[ Q = f^{-1}(0) = \{(x, w) : |x| = |w|\} \]

and the restriction to $Q$ of the projection $(x, w) \mapsto w$ is also a regular mapping with the
modulus of surjection equal to one. However, the partial mapping $x \mapsto f(x, 0) = |x|$ is not
regular at zero.
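The contrast in Example 7.37 can be observed numerically at the origin. The following sketch (a grid-based illustration only, with a hypothetical radius `r` and grid size `n`) compares the range of $f$ on an $\ell^\infty$ ball around $(0,0)$ with the range of the partial mapping $x \mapsto f(x, 0)$ on the corresponding interval.

```python
def f(x, w):
    return abs(x) - abs(w)

r = 0.5                      # hypothetical radius of the l-infinity ball
n = 200                      # grid resolution
grid = [-r + 2 * r * i / n for i in range(n + 1)]

# values of the full mapping on the ball of radius r around (0, 0)
full_values = [f(x, w) for x in grid for w in grid]

# values of the partial mapping x -> f(x, 0) on [-r, r]
partial_values = [f(x, 0.0) for x in grid]

print(min(full_values), max(full_values))        # covers [-r, r]
print(min(partial_values), max(partial_values))  # covers only [0, r]
```

The full mapping reaches every value in $[-r, r]$, consistent with a modulus of surjection equal to one at the origin, whereas the partial mapping never takes negative values and therefore cannot cover any neighborhood of $0 = f(0,0)$.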
However, the following statement is true.

Proposition 7.38 ([93]). Let $F : \mathbb{R}^n \times \mathbb{R}^k \rightrightarrows \mathbb{R}^m$ be a semi-algebraic set-valued mapping
with locally closed graph, and let $\bar y \in F(\bar x, \bar p)$. Assume that

(a) $F$ is regular at $((\bar x, \bar p), \bar y)$;

(b) the set-valued mapping $\mathbb{R}^n \times \mathbb{R}^m \rightrightarrows \mathbb{R}^k$ associating the set $\{p : y \in F(x, p)\}$ with
any $(x, y) \in \mathbb{R}^n \times \mathbb{R}^m$ is regular at $((\bar x, \bar y), \bar p)$;

(c) there is a Whitney stratification $(M_i)$ of $\operatorname{Graph} F$ such that the restriction of the
projection $(x, p) \mapsto p$ to the set $S_i = \{(x, p) : (x, p, \bar y) \in M_i\}$, where $M_i$ is the stratum
containing $(\bar x, \bar p, \bar y)$, is regular at $(\bar x, \bar p)$.

Then $F_{\bar p} : x \mapsto F(x, \bar p)$ is regular at $(\bar x, \bar y)$.

It is now possible to state and prove a set-valued version of Theorem 1.13.

Theorem 7.39. Let the mapping $F : \mathbb{R}^n \times \mathbb{R}^k \rightrightarrows \mathbb{R}^m$ with closed graph and a closed
set $S \subset \mathbb{R}^m$ be both semi-algebraic. Denote by $F_p$ the set-valued mapping $x \mapsto F(x, p)$.
If $F$ is transversal to $S$, then for all $p$, with possible exception of a semi-algebraic set of
dimension smaller than $k$, $F_p$ is transversal to $S$.

Proof. The theorem is trivial if $F(x, p) \cap S = \emptyset$ for all $(x, p)$, so we assume that $F(x, p)$
meets $S$ for some values of the arguments. Then $(0, 0)$ is a regular value of the mapping
$\Psi : \mathbb{R}^n \times \mathbb{R}^m \times \mathbb{R}^k \rightrightarrows \mathbb{R}^{2m}$, $\Psi(x, y, p) = (F(x, p) - y) \times (S - y)$. Let $Q = \Psi^{-1}(0, 0)$. This is
a semi-algebraic set, so by Theorem 7.36 there is a semi-algebraic set $C_0 \subset \mathbb{R}^k$ such that
$\dim C_0 < k$ and every $p \in \mathbb{R}^k \setminus C_0$ is a regular value of the restriction $\pi|_Q$ of the projection
$\pi : (x, y, p) \mapsto p$.

Take an $r > n + m - k$, and let $(M_i)$ be a $C^r$-Whitney stratification of $\operatorname{Graph} \Psi$
with all $M_i$ being semi-algebraic manifolds. Then for any $i$ there is a semi-algebraic set
$C_i \subset \mathbb{R}^k$ such that any $p \in \mathbb{R}^k \setminus C_i$ is a regular value of $\pi|_{M_i}$. The union $C = \bigcup_{i \ge 0} C_i$ is
also a semi-algebraic set of dimension smaller than $k$ and, as we have just seen, for any
$p \notin C$ all of the assumptions of Proposition 7.38 are satisfied for $\Psi$. Therefore $(0, 0)$ is a
regular value of $\Psi_p$. By Proposition 6.15 this means that $F_p$ is transversal to $S$. □

8 Some applications to analysis and optimization 

In this section we give several examples illustrating the power of regularity theory as a 
working instrument for treating various problems in analysis and optimization. We do 
not try each time to prove the result under the most general assumptions. The purpose is 
rather to demonstrate how regularity considerations help to understand and/or simplify 
the analysis of one or another phenomenon. Again, it should be said that some interesting
areas of application of metric regularity remain outside the scope of the paper. We just
mention the role of regularity in numerical optimization (see e.g. [55, 109, 110]), connections
with metric fixed point theory (e.g. [50]), and recent developments associated
with tilt stability, quadratic growth, etc.
8.1 Subdifferential calculus

In each of the three calculus rules stated in Proposition 5.9 we assume that one of the functions
is Lipschitz. One of the reasons (especially important in the proof of the exact sum rule) is that
Lipschitz functions have bounded subdifferentials. But what happens when both functions
are not Lipschitz? For instance, what can be said about the normal cone to an intersection of
sets? As in the calculus of convex subdifferentials, we do need some qualification conditions
to ensure the result.

Theorem 8.1. Let $X$ be a Banach space and let $S_i$, $i = 1, 2$, be closed subsets of $X$. Let
further $\bar x \in S = S_1 \cap S_2$. If $S_1$ and $S_2$ are subtransversal at $\bar x$, then

\[ N_G(S, \bar x) \subset N_G(S_1, \bar x) + N_G(S_2, \bar x). \]

Explicitly, this theorem was first mentioned in [88] but de facto it was proved already
in [85] (see also m, Proposition 3). It turns out that subtransversality is the most
general of all so far available conditions that would guarantee the inclusion. The most
popular subdifferential transversality condition (condition (b) of Theorem 6.12) may be
much stronger.

The inclusion is among the most fundamental facts of the subdifferential calculus: 
enough to mention that in the majority of publications on the subject it is used as the 
starting point for deriving all other calculus rules. Below is a sketch of the proof of the 
theorem for the finite dimensional situation. 

Proof. We need the following elementary and/or well known facts about functions on and sets
in $\mathbb{R}^n$:

• $N(Q, x) \cap B = \partial d(\cdot, Q)(x)$ if $x \in Q$;

• if $x^* \in \partial d(\cdot, Q)(x)$ and $u \in Q$ is the closest to $x$, then $x^* \in N(Q, u)$;

• if $x \in Q$ and $f(\cdot)$ is nonnegative, equal to zero at $x$ and $f(u) \ge d(u, Q)$ in a
neighborhood of $x$, then $\partial_F d(\cdot, Q)(x) \subset \partial_F f(x)$ for the Fréchet subdifferential.

Combining this with the definition of the limiting subdifferential, we conclude that
for $Q$, $f$ and $x$ as above, $\partial d(\cdot, Q)(x) \subset \partial f(x)$, a fact that is surprisingly missing from
monographic publications.

By the assumption there is a $K > 0$ such that $d(x, S) \le K(d(x, S_1) + d(x, S_2))$ for $x$ near $\bar x$, so
applying the above to $f(x) = K(d(x, S_1) + d(x, S_2))$ along with the exact sum rule
of Proposition 5.9 we conclude that $\partial d(\cdot, S)(\bar x) \subset K\big(\partial d(\cdot, S_1)(\bar x) + \partial d(\cdot, S_2)(\bar x)\big)$ and the result
follows. □
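Two planar examples, chosen here only for illustration, show what the subtransversality assumption contributes. First let $S_1 = \{x \in \mathbb{R}^2 : x_1 \le 0\}$, $S_2 = \{x \in \mathbb{R}^2 : x_2 \le 0\}$ and $\bar x = 0$. With the Euclidean norm and $t_+ = \max\{t, 0\}$,

\[ d(x, S_1 \cap S_2) = \sqrt{(x_1)_+^2 + (x_2)_+^2} \le (x_1)_+ + (x_2)_+ = d(x, S_1) + d(x, S_2), \]

so the sets are subtransversal with $K = 1$, and indeed

\[ N(S_1 \cap S_2, 0) = \mathbb{R}^2_+ = (\mathbb{R}_+ \times \{0\}) + (\{0\} \times \mathbb{R}_+) = N(S_1, 0) + N(S_2, 0). \]

On the other hand, for $S_1 = \{x : x_2 \ge x_1^2\}$ and $S_2 = \{x : x_2 \le -x_1^2\}$ we have $S_1 \cap S_2 = \{0\}$, and at the points $x = (t, 0)$

\[ d(x, S_1 \cap S_2) = |t|, \qquad d(x, S_i) \le t^2, \quad i = 1, 2, \]

so no constant $K$ works near zero. Correspondingly, $N(S_1 \cap S_2, 0) = \mathbb{R}^2$, while $N(S_1, 0) + N(S_2, 0) = \{0\} \times \mathbb{R}$, and the inclusion of Theorem 8.1 fails.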

8.2 Necessary conditions in constrained optimization. 

We discuss here two ways to apply regularity theory to necessary optimality conditions
and then a general approach to necessary conditions associated with one of them. Both
substantially differ from classical proofs that include linearization and separation as the
major steps (see e.g. [103, 150, 152]). Verification of relevance of linearization is
usually the central and most difficult part of the proofs. It is established under certain
constraint qualifications which always imply, and often are equivalent to, regularity of
the constraint mapping (as in the case of the popular Mangasarian-Fromovitz and Slater
qualification conditions; see e.g. [150], where the connection with regularity was made
explicit).

We refer to [113, 128, 130] for extensions of the classical approach to nondifferentiable
optimization in which convex separation is replaced by an “extremal principle”. The point
is however that a fuller use of regularity arguments makes the way to necessary conditions
much shorter. To begin with we shall consider the problem

\[ \text{minimize } f(x), \quad \text{s.t. } F(x) \in Q,\ x \in C \tag{8.1} \]

(where $F : X \to Y$ is single-valued and $Q \subset Y$ and $C \subset X$ are closed sets) assuming
for simplicity that both $X$ and $Y$ are finite dimensional, although the results have been
originally proved in much more general situations.

8.2.1 Non-covering principle. 

So let $\bar x \in C$ be a solution of the problem. Let $\Phi$ stand for the restriction to $C$ of the
set-valued mapping $x \mapsto (f(x) - \mathbb{R}_-) \times (F(x) - Q)$ from $X$ into $Z = \mathbb{R} \times Y$. Clearly,
this mapping cannot be regular near $(\bar x, (f(\bar x), 0))$. (Indeed, if $U$ is a small neighborhood
of $\bar x$, then $\Phi(U)$ cannot contain points $(f(\bar x) - \varepsilon, 0)$.) It follows that the negation of any
condition sufficient for regularity is a necessary condition for $\bar x$ to be a local solution in
the problem. Applying Theorem 6.17 and Corollary 6.18 we get the following result.

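Theorem 8.2. Assume that $F : \mathbb{R}^n \to \mathbb{R}^m$ is Lipschitz in a neighborhood of $\bar x$. If $\bar x$ is a
local solution of (8.1), then there is a nonzero pair $(\lambda, y^*)$ such that $\lambda \ge 0$, $y^* \in N(Q, \bar y)$, where $\bar y = F(\bar x)$,
and

\[ 0 \in \partial\big(\lambda f + (y^* \circ F|_C)\big)(\bar x). \tag{8.2} \]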

This formulation needs some comments. We have stated the theorem in finite dimensions
for simplicity; its infinite dimensional version can be found e.g. in [84]. Note further
that a more customary formulation would be

\[ 0 \in \partial\big(\lambda f + (y^* \circ F)\big)(\bar x) + N(C, \bar x). \tag{8.3} \]

This condition is usually more convenient (constraints are separated) but in general weaker
than (8.2). It is equivalent to (8.2) if e.g. $C = X$ (obvious) or if both $f$ and $F$ are
continuously differentiable and the constraint qualification

\[ 0 \in F'(\bar x)^* y^* + N_C(\bar x), \quad y^* \in N_Q(F(\bar x)) \ \Rightarrow\ y^* = 0 \tag{8.4} \]

is satisfied (see e.g. [159], Example 10.8), which means that $F|_C$ is transversal to $Q$ at $\bar x$
(Proposition 7.38).

Finally, we observe that the necessary condition is stated in the Lagrangian form.
Again, such a condition can be substantially more precise than the “separated” condition
$0 \in \lambda\partial f(\bar x) + \partial(y^* \circ F)(\bar x)$ (say, in the absence of the constraint $x \in C$) which in various
forms often appears in the literature. Both conditions are equivalent if, say, $f$ is continuously
differentiable.

The “non-covering” approach to necessary optimality conditions was probably first applied
by Warga [173] in a fairly classical setting of the standard optimal control problem.

Warga refers not to the Lyusternik-Graves theorem but to the result of Yorke [176], which
is a weakened version of the theorem for integral operators associated with ordinary
differential equations. But already the same year the controllability-optimality dichotomy
appeared as the main tool for proving necessary conditions for nonsmooth optimal control
in the papers by Clarke and by Warga [174]. In the context of an abstract optimization
problem a non-covering criterion seems to have been first applied by Dmitruk-Milyutin-Osmolowski
in [45] to problems with finitely many functional constraints and recently, to
problems with mixed structure (partly smooth and partly close to convex), by Avakov,
Magaril-Il’yaev and Tikhomirov [9]. In subsection 8.3 we demonstrate how
this technique works for an abstract relaxed optimal control problem. Theorem 8.2 in an infinite
dimensional setting was obtained in [84] with the same proof based on the non-covering
criterion.

8.2.2 Exact penalty. 

The immediate predecessor of the approach we are going to discuss here was the idea of
an “exact penalty” offered by Clarke [35, 38]: if $f$ attains a local minimum on a closed set
$S$ at $\bar x \in S$ and satisfies the Lipschitz condition near $\bar x$, then $\bar x$ is a point of unconstrained
local minimum of $g(x) = f(x) + K d(x, S)$ with $K$ greater than the Lipschitz constant of $f$ near
$\bar x$. Clarke used a fairly sophisticated reduction technique to apply this idea to problems
with functional constraints. The arguments however are dramatically simplified by directly
invoking regularity considerations.
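The exact penalty idea can be tried on a one-dimensional toy problem. The sketch below is an illustration under assumptions chosen here: $f(x) = x$ (Lipschitz constant 1), $S = [0, 1]$, and a brute-force search over a grid instead of an exact minimization. The constrained minimizer is $\bar x = 0$.

```python
def f(x):
    return x                          # Lipschitz with constant 1

def dist_to_S(x):
    """Distance from x to S = [0, 1]."""
    return max(0.0 - x, x - 1.0, 0.0)

def g(x, K):
    """Penalized function f + K * d(., S)."""
    return f(x) + K * dist_to_S(x)

# grid on [-2, 2] with step 0.001; 0.0 is an exact grid point
grid = [i / 1000 for i in range(-2000, 2001)]

# K above the Lipschitz constant: the penalty is exact
argmin_large_K = min(grid, key=lambda x: g(x, 2.0))

# K below the Lipschitz constant: the minimizer escapes the feasible set
argmin_small_K = min(grid, key=lambda x: g(x, 0.5))

print(argmin_large_K, argmin_small_K)
```

With $K = 2$ the unconstrained minimizer of $g$ on the grid coincides with the constrained solution, while with $K = 0.5$ the minimum is attained at the left end of the grid, outside $S$.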

Let us return to the problem (8.1), assuming as above that $F$ is single-valued and Lipschitz,
$X = \mathbb{R}^n$, $Y = \mathbb{R}^m$, and set as in Theorem 6.17

\[ \Phi(x) = \begin{cases} F(x) - Q, & \text{if } x \in C,\\ \emptyset, & \text{otherwise.} \end{cases} \]

Then our problem can be reformulated as

\[ \text{minimize } f(x), \quad \text{s.t. } 0 \in \Phi(x). \tag{8.5} \]

Suppose that $\Phi$ is subregular at $(\bar x, 0)$. This means that there is some $K_0 > 0$ such that
$d(x, \Phi^{-1}(0)) \le K_0 d(0, \Phi(x))$ for $x$ of a neighborhood of $\bar x$. But $\Phi^{-1}(0)$ is the feasible set of
our problem, so that there is some other $K_1 > 0$ such that the function $f(x) + K_1 d(0, \Phi(x))$
attains a local minimum at $\bar x$ or, equivalently, the function $f(x) + K_1 d(0, F(x) - Q)$ attains
a local minimum at $\bar x$ subject to $x \in C$. The last function is Lipschitz continuous near $\bar x$,
hence there is a $K$ such that

\[ g(x) = f(x) + K\big(d(0, F(x) - Q) + d(x, C)\big) \tag{8.6} \]

attains an unconditional local minimum at $\bar x$.

If, on the other hand, $\Phi$ is not subregular at $\bar x$, Theorems 6.1 and 6.17 imply together
that $0 \in \partial(y^* \circ F)(\bar x) + N(C, \bar x)$ for some nonzero $y^* \in N(Q, F(\bar x))$. From here we easily
get a weakened version of Theorem 8.2 with the Lagrangian condition replaced by its
“separated” version

\[ 0 \in \lambda\partial f(\bar x) + \partial(y^* \circ F)(\bar x) + N(C, \bar x), \quad y^* \in N(Q, F(\bar x)). \]
This is a definite drawback, as we have already mentioned, which however is
counterbalanced by some serious advantages. First we note that $g$ is defined in terms of the
original data, which makes it possible to study higher order optimality conditions using
this function. This is how such a technique was used for the first time in [80] in order
to get necessary optimality conditions earlier obtained by Levitin-Milyutin-Osmolowski.

Another advantage is that the second approach is more universal. It can work for
problems for which using scalarized coderivatives is either difficult or just impossible as,
say, in problems involving inclusions $0 \in \Phi(x)$ with general set-valued $\Phi$. This is a typical
case in optimal control of dynamic systems described by differential inclusions. Loewen
[125] was the first to use this approach to prove a maximum principle in a free right end
point problem of that sort. The analytic challenge in his proof was to find an upper
estimate for the distance to the feasible set. However, the next step in the development,
the “optimality alternative” discussed below, removes any need for such an estimate.

8.2.3 Optimality alternative. 

Consider the abstract problem with $(X, d)$ being a complete metric space:

\[ \text{minimize } f(x), \quad \text{subject to } x \in Q \subset X. \]

Theorem 8.3. Let $\varphi$ be a nonnegative lsc function on $X$ equal to zero at $\bar x$. If $\bar x \in Q$ is
a local solution to the problem, then the following alternative holds true:

• either there is a $\lambda > 0$ such that the function $\lambda f + \varphi$ has an unconstrained local minimum
at $\bar x$;

• or there is a sequence $(x_n) \to \bar x$ such that $\varphi(x_n) < n^{-1} d(x_n, Q)$ and the function
$x \mapsto \varphi(x) + n^{-1} d(x, x_n)$ attains a local minimum at $x_n$ for each $n$.

We shall speak about the regular case if the first option takes place and the singular or
non-regular case otherwise.

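Proof. Indeed, either there is an $R > 0$ such that $R\varphi(x) \ge d(x, Q)$ for all $x$ of a neighborhood
of $\bar x$, or there is a sequence $(z_n)$ converging to $\bar x$ and such that $n^2\varphi(z_n) < d(z_n, Q)$.
In the first case (as $f$ is Lipschitz) we have for $x \notin Q$ and $u \in Q$ close to $x$ (so that e.g.
$d(x, u) \le 2d(x, Q)$):

\[ f(x) \ge f(u) - L d(x, u) \ge f(\bar x) - 2LR\varphi(x), \]

if $L$ is a Lipschitz constant of $f$. Hence $f + 2LR\varphi$ attains a local minimum at $\bar x$, that is,
the first option holds with $\lambda = (2LR)^{-1}$.

As $X$ is complete and $\varphi$ is lower semicontinuous, in the second case we can apply Ekeland’s principle to $\varphi$
(taking into account that $\varphi(z_n) < \inf \varphi + n^{-2} d(z_n, Q)$) and find $x_n$ such that $d(x_n, z_n) \le
n^{-1} d(z_n, Q)$, $\varphi(x_n) \le \varphi(z_n)$ and $\varphi(x) + n^{-1} d(x, x_n) > \varphi(x_n)$ for $x \ne x_n$. We have finally, for $n \ge 2$,

\[ d(x_n, Q) \ge d(z_n, Q) - d(x_n, z_n) \ge (1 - n^{-1}) d(z_n, Q) > (1 - n^{-1}) n^2 \varphi(z_n) \ge n\varphi(x_n) \]

as claimed. □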
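A one-dimensional toy problem shows both branches of the alternative at work. In the sketch below everything is a hypothetical choice made for illustration: $f(x) = x$, $Q = [0, \infty)$, $\bar x = 0$, the test function $\varphi(x) = d(x, Q)^2$, the candidate sequence $x_n = -1/(2n)$, and a sampling-based check of local minimality in place of a proof.

```python
def dist_Q(x):
    """Distance from x to Q = [0, inf)."""
    return max(-x, 0.0)

def f(x):
    return x

def phi(x):
    """Test function: squared distance to Q (nonnegative, zero at 0)."""
    return dist_Q(x) ** 2

def is_local_min(h, x0, radius=1e-3, samples=2001):
    """Sampling check that h(x) >= h(x0) on [x0 - radius, x0 + radius]."""
    pts = [x0 + radius * (2 * i / (samples - 1) - 1) for i in range(samples)]
    return all(h(p) >= h(x0) - 1e-15 for p in pts)

# Regular option: lam * f + phi never has a local minimum at 0
regular_option = [
    is_local_min(lambda x, lam=lam: lam * f(x) + phi(x), 0.0)
    for lam in (0.1, 1.0, 10.0)
]

# Singular option: x_n = -1/(2n) satisfies both requirements of the theorem
singular_option = []
for n in range(1, 21):
    x_n = -1.0 / (2 * n)
    small_phi = phi(x_n) < dist_Q(x_n) / n
    local_min = is_local_min(lambda x, n=n, x_n=x_n: phi(x) + abs(x - x_n) / n, x_n)
    singular_option.append(small_phi and local_min)

print(regular_option, all(singular_option))
```

With this quadratic test function the problem falls into the singular case; replacing `phi` by `dist_Q` itself would give the regular case with $\lambda = 1$, which illustrates the freedom in choosing the test function.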

Thus, a constrained problem reduces to one or a sequence of unconstrained minimization
problems. Hopefully, such problems can be easier to analyze thanks to the freedom
of choosing $\varphi$, which we call the test function in the sequel. Even before the alternative was
explicitly stated it was de facto used to prove the maximum principle in various problems
of optimal control [72, 86, 170]. Here is a brief account of how the alternative works for
optimal control of systems governed by differential inclusions.


8.2.4 Optimal control of differential inclusion. 

As the first example of application of the alternative we shall briefly consider the following
problem of optimal control of a system governed by a differential inclusion (see also the next
subsection 8.3): minimize

\[ \ell(x(0), x(T)) \tag{8.7} \]

on trajectories of the differential inclusion

\[ \dot x \in F(t, x), \tag{8.8} \]

satisfying the end point condition

\[ (x(0), x(T)) \in S. \tag{8.9} \]

The natural space to treat the problem is $W^{1,1}$. Let $\bar x(\cdot)$ be a local solution. For any
$x(\cdot) \in W^{1,1}$ set

\[ \varphi(x(\cdot)) = \int_0^T d\big(\dot x(t), F(t, x(t))\big)\,dt + d\big((x(0), x(T)), S\big). \]

Clearly, $\varphi$ is nonnegative and $\varphi(\bar x(\cdot)) = 0$. Thus, if $\ell$ is a Lipschitz function, we can apply
the alternative to get a necessary optimality condition. According to the alternative, either
there is a $\lambda > 0$ such that $\bar x(\cdot)$ is a local minimum of

\[ \lambda\ell(x(0), x(T)) + d\big((x(0), x(T)), S\big) + \int_0^T d\big(\dot x(t), F(t, x(t))\big)\,dt, \]

or there is a sequence $(x_n(\cdot))$ converging to $\bar x(\cdot)$ such that every $x_n(\cdot)$ is not feasible in
(8.7)-(8.9) and is a local minimum of the functional

\[ d\big((x(0), x(T)), S\big) + \int_0^T d\big(\dot x(t), F(t, x(t))\big)\,dt + n^{-1}\Big(\|x(0) - x_n(0)\| + \int_0^T \|\dot x(t) - \dot x_n(t)\|\,dt\Big). \]

In both cases we get an (unconstrained) Bolza problem. Analysis of such problems needs
different techniques and we refer to [86, 170], where necessary optimality conditions for the
problem were obtained along these lines. A more general result was established a few years
later by Clarke [39] (actually the most general for optimal control of differential inclusions
so far) but a shorter proof of Clarke’s theorem based on the optimality alternative is now also
available [98].

To conclude, I wish to note that this is not the only possible application of regularity
related ideas to optimal control: metric regularity also plays a role in the
Hamilton-Jacobi theory of optimal control.
8.2.5 Constraint qualification.

The last question we intend to briefly discuss in this subsection concerns constraint
qualifications in optimization problems. They often play an important role in proofs but their
basic function is to guarantee that the multiplier $\lambda$ of the cost function in the necessary
(e.g. Lagrangian) optimality conditions is positive. The point is that constraint qualifications
are often connected with regularity properties of the constraint mapping. We shall
discuss just one example.

Let us say that the problem is normal at a certain feasible point if the constraint
mapping is regular at the point. The problem is normal if either the feasible set is empty
or the problem is normal at every feasible point. In the case of the problem (8.1) the
constraint mapping is the restriction of $F$ to $C$, so by Theorem 6.17 normality is guaranteed
if $F|_C$ is transversal to $Q$, that is, if $y^* \in N(Q, F(x))$ and $0 \in D^*F|_C(x)(y^*)$ imply together
that $y^* = 0$, which in turn is implied by the condition

\[ 0 \in \partial(y^* \circ F)(x) + N(C, x) \ \ \&\ \ y^* \in N(Q, F(x)) \ \Rightarrow\ y^* = 0. \tag{8.10} \]

This is the now standard constraint qualification in nonsmooth optimization (see e.g.
[55, 109, 130, 159]). If $f$ and $F$ are continuously differentiable and the sets $C$ and $Q$ are
convex, (8.10) is dual to Robinson’s constraint qualification [150].

8.3 An abstract relaxed optimal control problem.

Here we apply the optimality alternative to get a necessary optimality condition in the
problem

\[ \text{minimize } f(x) \quad \text{s.t. } F(x, u) = 0,\ u \in U. \tag{8.11} \]

Here $F : X \times U \to Y$, $X$ and $Y$ are separable Banach spaces and $U$ is a set. The problem
is similar to problems with mixed smooth and convex structures studied in [103, 165]. But
contrary to [103, 165], here we do not assume that $F$ is continuously differentiable in $x$.
We shall formulate the requirements on $F$ a bit later. First we need to introduce and
discuss some necessary concepts.

We say that a continuous mapping $F : X \to Y$ is semi-Fredholm at $\bar x$ if it has at $\bar x$ a
strict prederivative of the form $\mathcal{H}(h) = Ah + \|h\|Q$, where $A : X \to Y$ is a linear bounded
operator that sends $X$ onto a closed subspace of $Y$ of finite codimension and $Q \subset Y$ is
a compact set (that can be assumed convex and symmetric). We say furthermore that
$S \subset X$ is finite-dimensionally generated if $S = \Lambda^{-1}(P)$, where $\Lambda : X \to \mathbb{R}^n$ is a continuous
linear operator and $P \subset \mathbb{R}^n$ is closed.

Proposition 8.4 (non-covering principle for (8.11) [72]). Let $F : X \to Y$ be
semi-Fredholm at $\bar x$, and let $S$ be a finite-dimensionally generated subset of $X$. Let further
$F|_S$ be the restriction of $F$ to $S$, that is, the set-valued mapping equal to $\{F(x)\}$ on $S$
and $\emptyset$ outside of $S$. If $F|_S$ is not regular near $\bar x$, then there is a $y^* \ne 0$ such that $0 \in
\partial_G(y^* \circ F)(\bar x) + N_G(S, \bar x)$. Moreover, the weak*-closure of the set of such $y^*$ with norm 1
does not contain zero.*

* More general versions of this result can be found in many publications related to “point estimates”
and compactness properties of subdifferentials, see e.g. [85, 105, 106, 108, 130].
We intend to use this principle to prove the following theorem.

Theorem 8.5. Let $(\bar x, \bar u)$ be a solution of (8.11). We assume that

($A_1$) $f$ satisfies the Lipschitz condition in a neighborhood of $\bar x$;

($A_2$) for any $u \in U$ the mapping $F(\cdot, u)$ is Lipschitz in a neighborhood of $\bar x$, and $F(\cdot, \bar u)$
is semi-Fredholm at $\bar x$;

($A_3$) $F(x, U)$ is a convex set for any $x$ of a neighborhood of $\bar x$;

($A_4$) $S$ is finite-dimensionally generated.

Let further $\mathcal{L}(\lambda, y^*, x, u) = \lambda f(x) + \langle y^*, F(x, u)\rangle$ be the Lagrangian of the problem.
Then there are $\lambda \ge 0$ and $y^* \in Y^*$ such that the following relations hold true:

$\lambda + \|y^*\| > 0$ (non-triviality);

$0 \in \partial_G\mathcal{L}(\lambda, y^*, \cdot, \bar u)(\bar x) + N_G(S, \bar x)$ (Euler-Lagrange inclusion);

$\langle y^*, F(\bar x, u)\rangle \ge \langle y^*, F(\bar x, \bar u)\rangle$, $\forall\, u \in U$ (the maximum principle).

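Proof. Given a finite collection $\mathcal{U} = (u_1, \dots, u_k)$ of elements of $U$, we define a mapping
$\Psi_{\mathcal{U}} : X \times \mathbb{R}^k \to Y$ by

\[ \Psi_{\mathcal{U}}(x, \alpha_1, \dots, \alpha_k) = F(x, \bar u) + \sum_{i=1}^k \alpha_i\big(F(x, u_i) - F(x, \bar u)\big). \]

It is an easy matter to see that this mapping is also semi-Fredholm at $(\bar x, 0)$.

Consider the problem

\[ \text{minimize } f(x) \quad \text{s.t. } \Psi_{\mathcal{U}}(x, \alpha_1, \dots, \alpha_k) = 0,\ x \in S,\ \alpha_i \ge 0. \tag{$P_{\mathcal{U}}$} \]

Then $(\bar x, 0, \dots, 0)$ solves the problem (as immediately follows from ($A_3$)). Let further
$\Phi : X \times \mathbb{R}^{k+1} \to \mathbb{R} \times Y$ be defined by

\[ \Phi(x, \alpha_0, \dots, \alpha_k) = \big(f(x) + \alpha_0,\ \Psi_{\mathcal{U}}(x, \alpha_1, \dots, \alpha_k)\big). \]

The restriction of this mapping to $\hat S = S \times \mathbb{R}^{k+1}_+$ cannot be regular in a neighborhood of
$(\bar x, 0, \dots, 0)$ because no point $(f(\bar x) - \varepsilon, 0)$ can be a value of $\Phi$ at $x \in S$ close to $\bar x$ and
$\alpha \ge 0$ close to zero. It is an easy matter
to verify that $\Phi$ is also semi-Fredholm at $(\bar x, 0, \dots, 0)$ and we can apply Proposition 8.4.
Set $\hat{\mathcal{L}}(\lambda, y^*, x, \alpha_0, \dots, \alpha_k) = \lambda(f(x) + \alpha_0) + \langle y^*, \Psi_{\mathcal{U}}(x, \alpha_1, \dots, \alpha_k)\rangle$. By
the proposition there are multipliers $(\lambda, y^*) \ne 0$ such that

\[ 0 \in \partial_G\hat{\mathcal{L}}(\lambda, y^*, \cdot)(\bar x, 0, \dots, 0) + N_G\big(\hat S, (\bar x, 0, \dots, 0)\big). \]

We have (using the standard rules of subdifferential calculus)

\[ N_G\big(\hat S, (\bar x, 0, \dots, 0)\big) = N_G(S, \bar x) \times \mathbb{R}^{k+1}_-; \]
\[ \partial_G\hat{\mathcal{L}}(\lambda, y^*, \cdot)(\bar x, 0, \dots, 0) \subset \partial_G\mathcal{L}(\lambda, y^*, \cdot, \bar u)(\bar x)
\times \big\{\big(\lambda, \langle y^*, F(\bar x, u_1) - F(\bar x, \bar u)\rangle, \dots, \langle y^*, F(\bar x, u_k) - F(\bar x, \bar u)\rangle\big)\big\}. \]

It follows that there are $\xi_i \le 0$, $i = 0, \dots, k$, such that

\[ 0 \in \partial_G\mathcal{L}(\lambda, y^*, \cdot, \bar u)(\bar x) + N_G(S, \bar x); \]
\[ \lambda = -\xi_0 \ge 0; \]
\[ \langle y^*, F(\bar x, u_i) - F(\bar x, \bar u)\rangle = -\xi_i \ge 0, \quad i = 1, \dots, k. \]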





The relations remain obviously valid if we replace $\lambda$, $y^*$ by $r\lambda$, $ry^*$ with some positive
$r$. Thus for any finite collection $(u_1, \dots, u_k)$ of elements of $U$ we can find a pair of multipliers $(\lambda, y^*)$
satisfying the three above mentioned relations and the normalization condition $\lambda + \|y^*\| =
1$. Let $\Omega(u_1, \dots, u_k)$ be the weak*-closure of the set of all such pairs. Then $\Omega(u_1, \dots, u_k)$ is
weak*-compact and by Proposition 8.4 does not contain zero. It remains to notice that the
increase of the set $(u_1, \dots, u_k)$ may result only in a decrease of $\Omega(u_1, \dots, u_k)$ and therefore
there is a nonzero pair $\lambda$, $y^*$ common to all sets $\Omega(u_1, \dots, u_k)$. □

8.4 Genericity in tame optimization. 

Here by “tame optimization” we mean optimization problems with semi-algebraic data.
We consider the same class of problems as in (8.1). This time however we are interested
in the effects of perturbations and shall work with a family of problems depending on a
parameter $p$:

\[ \text{minimize } f(x, p), \quad \text{s.t. } F(x, p) \in Q,\ x \in C. \tag{8.12} \]

Here $x$ is an argument in the problem and $p$ is a parameter. So subdifferentials and
derivatives that will appear below are always with respect to $x$ alone. If $p$ is fixed, then
we denote the corresponding problem by $\mathcal{P}_p$.

Before we continue, we have to mention that for a semi-algebraic set $S \subset \mathbb{R}^n$ the
properties

• $S$ is a set of first Baire category in $\mathbb{R}^n$;

• $S$ has $n$-dimensional Lebesgue measure zero;

• $\dim S < n$

are equivalent. Thus, when we deal with semi-algebraic objects, e.g. in $\mathbb{R}^k$, the word
“generic” means “up to a semi-algebraic set of dimension smaller than $k$.”

We shall assume that $p$ is taken from an open set $P \subset \mathbb{R}^k$ and, as before, $x \in \mathbb{R}^n$ and
$F$ takes values in $\mathbb{R}^m$. Our main assumption is that

the restriction $F|_C(x, p)$ of $F$ to $C$ is transversal to $Q$.

This is definitely the case when $k = m$ and $F(x, p) = F(x) - p$. As to $F$ itself, we assume
that it is continuous with respect to $(x, p)$ and locally Lipschitz in $x$. The sets $C$ and $Q$
as usual are assumed closed.

Theorem 8.6 (generic normality). Under the stated assumptions, for a generic $p \in P$
the mapping $F|_C(\cdot, p)$ is transversal to $Q$. Thus for a generic $p$ the problem $\mathcal{P}_p$ is normal.

Proof. The first statement is immediate from Theorem 7.39, while the second follows from the
comments following the statement of Theorem 6.17. □

Let us call a point $x$ feasible in $\mathcal{P}_p$ a critical point of the problem if the non-degenerate
Lagrangian necessary condition of 8.2.1

\[ 0 \in \partial\big(f + (y^* \circ F|_C)\big)(x, p), \quad y^* \in N(Q, F(x, p)) \]

is satisfied. In this case the value of $f$ at $x$ is called a critical value of $\mathcal{P}_p$.
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Theorem 8.7 (generic finiteness of critical valnes). If under the stated assumptions, Vp 
is normal, then the problem may have only finitely many eritical values. Thus there is an 
integer N such that for a generic p the number of critical values in the problem does not 
exceed N. 

Proof. Consider the function

$$\mathcal{L}_p(x,y,y^*) = f(x,p) + \langle y^*, F|_C(x,p) - y\rangle + i_Q(y).$$

As follows from the standard calculus rules,

$$\partial\mathcal{L}_p(x,y,y^*) = \partial(f + y^*\circ F|_C)(x,p)\times\bigl(N(Q,y) - y^*\bigr)\times\{F(x,p) - y\}.$$

Thus, $(x,y,y^*)$ is a critical point of $\mathcal{L}_p$ if and only if $F(x,p) = y$, $0\in N(Q,y) - y^*$, that
is $y\in Q$ and $y^*\in N(Q,y)$, and $0\in\partial(f + y^*\circ F|_C)(x,p)$. In other words, $(x,y,y^*)$ is a
critical point of $\mathcal{L}_p$ if and only if $x$ is a feasible point in $\mathcal{P}_p$, $y = F(x,p)$ and the necessary
optimality condition is satisfied at $x$ with $y^*$ being the Lagrange multiplier. We also see
that in this case $\mathcal{L}_p(x,y,y^*) = f(x,p)$. In other words, critical values of the problem are
precisely critical values of $\mathcal{L}_p$.

By the Sard theorem $\mathcal{L}_p$ may have at most finitely many critical values, whence the
theorem. □

The last result we are going to present here has so far been proved only under some
additional assumptions on the elements of the problem. We shall explain it for the classical
case, although the semi-algebraic nature of the data remains crucial.

Theorem 8.8 (generic finiteness of critical points). Assume that $p = (q,y)$ with $q\in\mathbb{R}^n$
and $y\in\mathbb{R}^m$, $f(x,p) = f(x) - \langle q,x\rangle$, $F(x,p) = F(x) - y$ with $f(x)$ and $F(x)$ both
continuously differentiable. Assume further that the sets $C$ and $Q$ are closed and convex.
Then there is an integer $N$ such that for a generic $p$ the number of pairs $(x,y^*)$ such that
$x$ is a critical point in $\mathcal{P}_p$ and $y^*$ a corresponding Lagrange multiplier does not exceed $N$.
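
As a hedged illustration of Theorems 8.7 and 8.8 (a toy instance of our own choosing, not taken from the cited sources), take $n = m = 1$, $C = Q = \mathbb{R}$, $F\equiv 0$ and $f(x) = x^3/3$. The constraint is then vacuous, $N(Q,\cdot) = \{0\}$ forces $y^* = 0$, and criticality reduces to $f'(x) = q$:

```latex
\[
f(x,p) = \tfrac{1}{3}x^{3} - qx, \qquad p = (q,y)\in\mathbb{R}^{2},
\qquad 0 = \frac{\partial f}{\partial x}(x,p) = x^{2} - q .
\]
\[
\#\{\text{critical pairs } (x,y^{*})\} =
\begin{cases}
0, & q < 0,\\
1, & q = 0 \quad (x = 0,\ y^{*} = 0),\\
2, & q > 0 \quad (x = \pm\sqrt{q},\ y^{*} = 0),
\end{cases}
\qquad
f(\pm\sqrt{q},p) = \mp\tfrac{2}{3}\,q^{3/2}.
\]
```

In this instance $N = 2$ bounds both the number of critical pairs and the number of critical values. The parameters with $q = 0$, at which the single critical point is degenerate, form a line in $\mathbb{R}^2$, that is, a semi-algebraic set of dimension $1 < 2$.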

The theorem follows from the two results below that contain valuable information
about the geometry of subdifferential mappings of semi-algebraic functions.

Proposition 8.9 (dimension of the subdifferential graph [58]). The dimension of the
graph of the subdifferential (no matter which: Fréchet, limiting or Clarke) mapping of a
semi-algebraic function on $\mathbb{R}^n$ is $n$.

Proposition 8.10 (finiteness of preimage [93, 158]). Let $F:\mathbb{R}^n\rightrightarrows\mathbb{R}^n$ be a semi-algebraic
set-valued mapping such that $\dim(\operatorname{Graph}F)\le n$. If $y$ is a regular value of $F$, then
$F^{-1}(y)$ contains at most finitely many elements. Thus, there is an integer $N$ such that
for a generic $y$ the number of elements in $F^{-1}(y)$ cannot exceed $N$.

To see how the propositions lead to the proof of the theorem, we note first that
$D^*F|_C(x)(y^*) = F'(x)y^* + N_C(x)$ if $x\in C$, $F$ is smooth and $C$ is convex. By Theorem
6.15, $F|_C(\cdot,p)$ is transversal to $Q$ if and only if

$$x\in C,\ F(x)\in Q + y,\ 0\in F'(x)y^* + N_C(x),\ y^*\in N(Q, F(x) - y)\ \Longrightarrow\ y^* = 0, \tag{8.13}$$

and by Theorem 7.39 this holds for a generic $y$.
Consider the function

$$g(x,y) = f(x) + i_C(x) + i_Q(F(x) - y).$$

By Proposition 8.9 the dimension of the graph of its subdifferential is $n + m$. Then so is
the dimension of the graph of the mapping

$$\Gamma(x,y^*) = \{(q,y) : (q,y^*)\in\partial g(x,y)\}.$$

Now by the Sard theorem a generic $(q,y)$ is a regular value of $\Gamma$, so (Proposition 8.10) for a
generic $(q,y)$ there are finitely many $(x,y^*)$ such that $(q,y)\in\Gamma(x,y^*)$. Finally, if for such
$(q,y)$ the qualification condition (8.13) is satisfied, then

$$\partial g(x,y) = \{(q,y^*) : q\in f'(x) + (y^*\circ F(\cdot))'(x) + N(C,x),\ y^*\in N(Q, F(x) - y)\}$$

(even if $Q$ is not convex, see again Exercise 10.8 in [159]), which in particular means that
$x$ is a critical point of $\mathcal{P}_p$ and $y^*$ is a Lagrange multiplier in the problem.

8.5 Method of alternating projection. 

This is one of the most popular methods for solving feasibility problems, due to its simplicity
and efficiency. The feasibility problem in its simplest form consists in finding a common
point of two sets, say $Q$ and $S$. The recipe offered by the method of alternating projections
is the following: starting with a certain $x_0$, we choose for $k = 0,1,\ldots$

$$x_{2k+1}\in\pi_Q(x_{2k}),\qquad x_{2k+2}\in\pi_S(x_{2k+1}),$$

where $\pi_Q(x)$ is the collection of points of $Q$ closest to $x$, and similarly for $\pi_S$.
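
The recipe can be tried on a small nonconvex example. The following is a minimal sketch under assumptions of our own (the sets, the starting point and all function names are hypothetical, chosen only for illustration): $Q$ is the horizontal axis in the plane and $S$ is the unit circle centred at $(0, 0.5)$, so the two sets cross at $(\pm\sqrt{0.75}, 0)$ at a nonzero angle.

```python
import math

CENTER = (0.0, 0.5)  # hypothetical data: S is the unit circle around CENTER


def project_line(point):
    """Nearest point of Q = {(t, 0)} (the horizontal axis)."""
    return (point[0], 0.0)


def project_circle(point):
    """A nearest point of the circle S (radius 1, centre CENTER)."""
    dx, dy = point[0] - CENTER[0], point[1] - CENTER[1]
    norm = math.hypot(dx, dy)
    if norm == 0.0:  # every point of S is nearest to the centre; pick one
        return (CENTER[0] + 1.0, CENTER[1])
    return (CENTER[0] + dx / norm, CENTER[1] + dy / norm)


def alternating_projections(start, steps):
    """Return the iterates that lie in Q: x_1, x_3, x_5, ..."""
    iterates = []
    x = start
    for _ in range(steps):
        x = project_line(x)    # x_{2k+1} in pi_Q(x_{2k})
        iterates.append(x)
        x = project_circle(x)  # x_{2k+2} in pi_S(x_{2k+1})
    return iterates


points = alternating_projections((2.0, 1.0), 40)
limit = (math.sqrt(0.75), 0.0)  # a common point of Q and S
errors = [abs(p[0] - limit[0]) for p in points]
```

In this particular instance the distance to the common point shrinks by a factor approaching 1/4 per double step, which is the kind of linear convergence discussed in this subsection; the factor is specific to the chosen sets.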

Von Neumann was the first to show, in the mid-30s (see [172]), that in the case of two closed
subspaces of a Hilbert space the method converges to a certain point in their intersection
(depending of course on the starting point). Later, in the 60s, Bregman [28]
and Gubin-Polyak-Raik [75] applied it to convex sets. In particular it was shown
in [75] that the convergence is linear if the relative interiors of the sets meet. Later Bauschke
and Borwein proved linear convergence if the sets are subtransversal at any common
point.

But in computational practice the method was successfully applied even to nonconvex
sets. The first explanation was given by Lewis, Luke and Malick [122]: if at a certain point
$\bar x$ in the intersection the sets are transversal and at least one of the sets is not “too
non-convex” in a certain sense (super-regular in the terminology of the authors), then
alternating projections converge linearly to a certain point common to the sets (not
necessarily $\bar x$), provided the starting point is sufficiently close to $\bar x$. And very recently it was
shown by Drusvyatskiy, Ioffe and Lewis [57] that transversality alone guarantees linear
convergence. In fact linear convergence was proved in [57] under a substantially weaker
condition of “intrinsic transversality” of the sets, but we believe that the geometric essence of
the phenomenon is captured by the implication transversality $\Rightarrow$ linear convergence. The
question whether linear convergence is guaranteed by subtransversality, as in the convex
case, remains open (see [77]).

Here is a short proof of linear convergence under the transversality assumption. Set

$$\varphi(x,y) = i_Q(x) + i_S(y) + \|x - y\|.$$

We claim that if $Q$ and $S$ are transversal at $\bar x\in Q\cap S$, then there are $\kappa > 0$ and $\delta > 0$
such that for any $x\in Q$, $y\in S$ close to $\bar x$

$$\max\{|\nabla\varphi(\cdot,y)|(x),\ |\nabla\varphi(x,\cdot)|(y)\} > \kappa.$$

To this end, we first note that by Theorem 6.12

$$\theta = \sup\{\langle u,v\rangle : u\in N(Q,\bar x),\ v\in -N(S,\bar x),\ \|u\| = \|v\| = 1\} < 1.$$

Fix a certain $\kappa\in(0,1)$ and assume that there are sequences $(x_n)\subset Q$, $(y_n)\subset S$,
$x_n\ne y_n$, converging to $\bar x$ and such that

$$|\nabla\varphi(\cdot,y_n)|(x_n) < \kappa,\qquad |\nabla\varphi(x_n,\cdot)|(y_n) < \kappa,$$

that is, the functions

$$x\mapsto\varphi(x,y_n) + \kappa\|x - x_n\| \quad\text{and}\quad y\mapsto\varphi(x_n,y) + \kappa\|y - y_n\|$$

attain local minima respectively at $x_n$ and $y_n$. This means that

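$$0\in w_n^* + \frac{x_n - y_n}{\|x_n - y_n\|} + \kappa B, \tag{8.14}$$

$$0\in z_n^* + \frac{y_n - x_n}{\|x_n - y_n\|} + \kappa B \tag{8.15}$$

for some $w_n^*\in N(Q,x_n)$ and $z_n^*\in N(S,y_n)$. Thus, for any limit point $(w^*,z^*)$ of $(w_n^*,z_n^*)$
we have

$$w^* = e + a,\qquad z^* = -e + b,$$

where $\|e\| = 1$, $\|a\|\le\kappa$, $\|b\|\le\kappa$. Consequently

$$\theta\ \ge\ \frac{\langle e + a,\ e - b\rangle}{\|e + a\|\,\|e - b\|}\ \ge\ \frac{(1-\kappa)^2}{(1+\kappa)^2}$$

and we get

$$\kappa\ \ge\ \frac{1 - \sqrt{\theta}}{1 + \sqrt{\theta}}. \tag{8.16}$$

This proves the claim.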

Then $\pi_Q(y) = \operatorname{argmin}\varphi(\cdot,y)$ and the method of alternating projections can be written
as follows:

$$x_{n+1}\in\operatorname{argmin}\varphi(x_n,\cdot);\qquad x_{n+2}\in\operatorname{argmin}\varphi(\cdot,x_{n+1}).$$

We obviously have $|\nabla\varphi(x_n,\cdot)|(x_{n+1}) = 0$. For a given $x$ (not necessarily in $Q$), consider
the function $\psi_x(y) = i_S(y) + \|x - y\|$. For any $c\in(0,1)$ the condition $|\nabla\psi_x|(x_{n+1})\le c$
obviously holds if

$$\langle x - x_{n+1},\ x_n - x_{n+1}\rangle\ \ge\ \sqrt{1 - c^2}\,\|x - x_{n+1}\|\,\|x_n - x_{n+1}\|.$$


Take a $c < \kappa$, and let $K_c$ be the collection of $x$ satisfying the above inequality. This is an
ice-cream cone with vertex at $x_{n+1}$. If $x\in Q\cap K_c$, then $|\nabla\varphi(\cdot,x_{n+1})|(x) > \kappa > c$. On
the other hand, as is easy to see, the distance from $x_n$ to the boundary of $K_c$ is precisely
$cr$, where $r = \|x_n - x_{n+1}\|$. Applying the Basic lemma for error bounds (Lemma 7.1), we
conclude that there is an $x\in Q$ with $\varphi(x,x_{n+1})\le\varphi(x_n,x_{n+1}) - c\kappa\|x_{n+1} - x_n\|$. It follows
that

$$\|x_{n+2} - x_{n+1}\| = \varphi(x_{n+2},x_{n+1})\ \le\ (1 - c\kappa)\|x_{n+1} - x_n\|,$$

which is linear convergence of $(x_n)$.

8.6 Generalized equations. 

By a generalized equation we mean the relation

$$0\in f(x) + F(x),$$

where $f$ is a single-valued and $F: X\rightrightarrows Y$ a set-valued mapping. Variational inequalities
and necessary optimality conditions in constrained optimization with smooth cost and
constraint functions are typical examples. The problem discussed in the theorem below
is what happens to the set of solutions of the generalized equation if the single-valued
term is slightly perturbed.

Theorem 8.11 (implicit function for generalized equations). Let $X$, $P$ be metric spaces,
and let $Z$ be a normed space. Consider the generalized equation

$$0\in f(x,p) + F(x), \tag{8.17}$$

where $f: X\times P\to Z$ and $F: X\rightrightarrows Z$. Let $(\bar x,\bar p)$ be a solution to the equation. Set
$\bar z = -f(\bar x,\bar p)$ and suppose that the following two properties hold:

(a) Either $X$ or the graph of $F$ is complete in the product metric and $F$ is regular near
$(\bar x,\bar z)$ with $\operatorname{sur}F(\bar x|\bar z) > r$;

(b) there is a $\rho > 0$ such that $f$ is continuous on $\overset{\circ}{B}(\bar x,\rho)\times\overset{\circ}{B}(\bar p,\rho)$ and $f(\cdot,p)$ satisfies
on $\overset{\circ}{B}(\bar x,\rho)$ the Lipschitz condition with constant $\ell < r$ for all $p\in\overset{\circ}{B}(\bar p,\rho)$.

Let $S(p)$ stand for the solution mapping of (8.17). Then

$$d(x,S(p'))\ \le\ (r - \ell)^{-1}\|f(x,p) - f(x,p')\|$$

if $x\in S(p)$ is close to $\bar x$ and $p,p'$ are sufficiently close to $\bar p$. Thus, if $f(x,\cdot)$ satisfies the
Lipschitz condition with constant $\alpha$ on a neighborhood of $\bar p$ for all $x\in\overset{\circ}{B}(\bar x,\rho)$, then $S(\cdot)$
has the Aubin property near $(\bar p,\bar x)$ with $\operatorname{lip}S(\bar p|\bar x)\le\alpha(r - \ell)^{-1}$.

Finally, if in addition $F$ is strongly regular near $(\bar x,\bar z)$, then $S(\cdot)$ has a Lipschitz
localization $s(\cdot)$ at $(\bar p,\bar x)$ with Lipschitz constant not greater than $\alpha(r - \ell)^{-1}$, so that

$$d(s(p),s(p'))\ \le\ (r - \ell)^{-1}\|f(s(p),p) - f(s(p),p')\|\ \le\ \alpha(r - \ell)^{-1}d(p,p').$$
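
A one-dimensional instance makes the estimate concrete. The sketch below rests on assumptions of our own (the data and function names are hypothetical): $X = P = Z = \mathbb{R}$, $f(x,p) = -p$ (so the Lipschitz constant in $x$ is $0$ and in $p$ is $1$) and $F(x) = x + N([0,\infty),x)$. One can check that $F^{-1}(z) = \max\{z,0\}$ is single-valued and 1-Lipschitz, so $F$ is strongly regular with surjection rate 1, and the solution map is $S(p) = \max\{p,0\}$; the theorem then predicts a Lipschitz constant arbitrarily close to 1.

```python
def solve(p):
    """Solution of 0 in -p + x + N([0, inf), x): the projection of p onto [0, inf)."""
    return max(p, 0.0)


def is_solution(x, p, tol=1e-12):
    """Check 0 in f(x, p) + F(x) with f(x, p) = -p and F(x) = x + N([0, inf), x)."""
    if x < 0:
        return False                 # F(x) is empty outside [0, inf)
    residual = p - x                 # must belong to the normal cone at x
    if x > 0:
        return abs(residual) <= tol  # the normal cone is {0} at interior points
    return residual <= tol           # the normal cone is (-inf, 0] at x = 0


def lipschitz_ratio(samples):
    """Largest observed |S(p) - S(q)| / |p - q| over pairs of distinct samples."""
    best = 0.0
    for p in samples:
        for q in samples:
            if p != q:
                best = max(best, abs(solve(p) - solve(q)) / abs(p - q))
    return best


grid = [i / 10.0 for i in range(-20, 21)]
observed = lipschitz_ratio(grid)
```

On this grid the observed ratio does not exceed 1, in agreement with the bound $\alpha(r - \ell)^{-1}$ for $\alpha = 1$, $\ell = 0$ and $r$ close to 1.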

Note that in view of Theorem 3.10 condition (a) is equivalent to the assumption that
there are $r > 0$ and $\xi > 0$ such that $|\nabla_\xi\psi_z|(x,v) > r$ (where $\psi_z(x,v) = d(z,v) +
i_{\operatorname{Graph}F}(x,v)$) if e.g. $d(x,\bar x) < \rho$, $\|z - \bar z\| < \rho$ and $z\ne v\in F(x)$.

Proof. Set $G(x,p) = f(x,p) + F(x)$ and let $H(p,z) = (G(\cdot,p))^{-1}(z)$, so that $S(p) =
H(p,0)$. As the Lipschitz constants of the functions $f(\cdot,p)$ are bounded by the same $\ell$ for
all $p\in\overset{\circ}{B}(\bar p,\rho)$, it follows from Theorem 4.5 that there is a $\delta > 0$ such that for every
$p\in\overset{\circ}{B}(\bar p,\rho)$ the inequality $d(x,H(p,z))\le(r - \ell)^{-1}d(z,G(x,p))$ holds if $d(x,\bar x) < \delta$ and
$\|z - z(p)\| < \delta$, where $z(p) = f(\bar x,p) - f(\bar x,\bar p)\in G(\bar x,p)$. As $f$ is continuous, we can
choose $\lambda > 0$ such that $\|z(p)\| < \delta$ for $p\in\overset{\circ}{B}(\bar p,\lambda)$. For such $p$ we have $0\in\overset{\circ}{B}(z(p),\delta)$
and therefore if $d(p,p') < \lambda$, we get, taking into account that $0\in f(x,p) + F(x)$ by the
assumption,

$$d(x,S(p'))\ \le\ (r - \ell)^{-1}d(0,G(x,p')) = (r - \ell)^{-1}d(0, f(x,p') + F(x))$$

$$= (r - \ell)^{-1}d(-f(x,p'),F(x))\ \le\ (r - \ell)^{-1}\|f(x,p') - f(x,p)\|.$$

This proves the first part of the theorem. The second now follows from Theorem 

imi □ 

The concept of a generalized equation was introduced by Robinson in [153]. The theorem
proved in [153, 154] corresponded to $f$ continuously differentiable in $x$ and $F$ being either
a maximal monotone operator or $F(x) = N(C,x)$, where $C$ is a closed convex set. We
refer to [55] for further results and bibliographic comments on generalized equations, which
are among the central objects of interest in that monograph.

An earlier version of part (a) of the theorem with a less precise estimate can be found 
in [109] (Theorem 4.9). Part (b) of the theorem relating to strong regularity is the basic 
statement of Theorem 5F.4 of [55] (generalizing the earlier results of Robinson in [154, 155]:
see also jUj for an earlier result). Our proof however is different: here the theorem appears 
as a direct consequence of Milyutin’s perturbation theorem. Note that in most of the
related results in [55] it is assumed (following [155]) that there exists a “strict estimator
$h(x)$ for $f$ of modulus $\ell$” such that $\operatorname{sur}(F + h)(\bar x|\bar y + h(\bar x)) > r$. This is a fairly convenient
device for practical purposes, but it adds no generality to the result, as the case with $h$
reduces to the setting of the theorem if we replace $F + h$ by $F$ and $f - h$ by $f$.

8.7 Variational inequalities over polyhedral sets. 

A variational inequality is a relation of the form

$$0\in\varphi(x) + N(C,x), \tag{8.18}$$

where $\varphi:\mathbb{R}^n\to\mathbb{R}^n$ is a single-valued mapping and $C\subset\mathbb{R}^n$ is a convex set. If $C = K$ is a
cone, it is equivalent to

$$x\in K,\qquad \varphi(x)\in K^\circ,\qquad \langle x,\varphi(x)\rangle = 0.$$

The problem of finding such an x is known as a complementarity problem (see e.g. m)- 
Problems of this kind typically appear in nonlinear programming in connection with
necessary optimality conditions.

Consider for instance the problem

$$\text{minimize}\ f_0(x)\quad \text{s.t.}\quad f_i(x)\le 0,\ i = 1,\ldots,k,\qquad f_i(x) = 0,\ i = k+1,\ldots,m, \tag{8.19}$$

with $f_0,\ldots,f_m$ twice continuously differentiable. If $\bar x$ is a solution of the problem, then
(assuming that the problem is normal and setting $f = (f_1,\ldots,f_m)$) there is a $\bar y\in\mathbb{R}^m$
such that

$$\nabla f_0(\bar x) + \langle\bar y,\nabla f(\bar x)\rangle = 0.$$

Setting

$$\varphi(x,y) = \begin{pmatrix}\nabla f_0(x) + \langle y,\nabla f(x)\rangle\\ -f(x)\end{pmatrix},\qquad C = \mathbb{R}^n\times\bigl(\mathbb{R}^k_+\times\mathbb{R}^{m-k}\bigr),$$

we see that $(\bar x,\bar y)$ solves (8.18) (with $x$ replaced by $(x,y)$).

Consider the set-valued mapping $\Phi(x) = \varphi(x) + N(C,x)$ associated with (8.18), assuming
that $C$ is a convex polyhedral set. What can be said about regularity of such a
mapping near a certain $(\bar x,\bar y)\in\operatorname{Graph}\Phi$? Applying Milyutin’s perturbation theorem
(Theorem 4.5) and Theorem 4.11 and taking into account that the Lipschitz constant of
$h\mapsto\varphi(\bar x + h) - \varphi'(\bar x)h$ at zero is zero, we immediately get

Proposition 8.12. Let $\bar y\in\Phi(\bar x)$ for some $\bar x\in C$. Set $A = \varphi'(\bar x)$ and $\hat\Phi(x) = Ax +
N(C - \bar x, x)$. Then $\Phi$ is (strongly) regular near $(\bar x,\bar y)$ if and only if $\hat\Phi$ is (strongly) regular
near $(0,0)$, and $\operatorname{sur}\Phi(\bar x|\bar y) = \operatorname{sur}\hat\Phi(0|0)$.

In other words, the regularity properties of $\Phi$ are the same as those of its “linearization” $\hat\Phi$.
Therefore in what follows we can deal only with the linear variational inequality

$$0\in Ax + N(C,x) \tag{8.20}$$

and the associated mapping

$$\Phi(x) = Ax + N(C,x).$$

The key role in our analysis is played by the concept of a face of a polyhedral set $C$,
which is any closed subset $F$ of $C$ such that any segment $\Delta\subset C$ containing a point $x\in F$
in its interior lies in $F$. A face of $C$ is proper if it is different from $C$. We refer to [158] for all
necessary information about faces. The following facts are important for our discussion:

• the set $\mathcal{F}_C$ of all faces of $C$ is finite;

• $F\in\mathcal{F}_C$ if and only if there is a $y\in\mathbb{R}^n$ such that $F = \{x\in C : \langle y,x\rangle\ge\langle y,u\rangle,\ \forall u\in C\}$;

• if $F,F'\in\mathcal{F}_C$ and $F\cap\operatorname{ri}F'\ne\emptyset$, then $F'\subset F$; a proper face of $C$ lies in the relative
boundary of $C$;

• if $F\in\mathcal{F}_C$ and $x_1, x_2$ belong to the relative interior of $F$, then $T(C,x_1) = T(C,x_2)$
and $N(C,x_1) = N(C,x_2)$.

The last property allows us to speak about the tangent and normal cones to $C$ at $F$, which
we shall denote by $T(C,F)$ and $N(C,F)$. It is an easy matter to see that

$$\dim F + \dim N(C,F) = n;\qquad \dim(F + N(C,F)) = n. \tag{8.21}$$
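
For a box the faces and their normal cones can be read off coordinate by coordinate, which gives a quick sanity check of the first equality in (8.21). The sketch below is an illustration under our own assumptions (a box with strictly ordered bounds; all names are hypothetical): the smallest face containing a point is encoded by which bounds are active there.

```python
def active_pattern(x, lower, upper):
    """Encode the smallest face of the box containing x:
    -1 / +1 where the lower / upper bound is active, 0 where x is strictly inside."""
    pattern = []
    for xi, lo, hi in zip(x, lower, upper):
        if not lo <= xi <= hi:
            raise ValueError("x is not in the box")
        pattern.append(-1 if xi == lo else (1 if xi == hi else 0))
    return tuple(pattern)


def face_and_normal_dims(pattern):
    """Dimension of the face (free coordinates) and of its normal cone (active ones)."""
    dim_face = sum(1 for s in pattern if s == 0)
    return dim_face, len(pattern) - dim_face


def in_normal_cone(v, pattern):
    """Membership in N(C, F): zero on free coordinates, sign-constrained on active ones."""
    return all(
        (s == 0 and vi == 0) or (s == -1 and vi <= 0) or (s == 1 and vi >= 0)
        for vi, s in zip(v, pattern)
    )


lower, upper = (0.0, 0.0, 0.0), (1.0, 1.0, 1.0)
pattern = active_pattern((0.0, 0.5, 1.0), lower, upper)  # a point on an edge of the cube
dims = face_and_normal_dims(pattern)
```

Here the point lies on an edge of the unit cube, so the face has dimension 1, its normal cone has dimension 2, and the two add up to $n = 3$.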

For any $x\in C$ denote by $F_{\min}(x)$ the minimal element of $\mathcal{F}_C$ containing $x$. The following is
straightforward:

$$x\in F\in\mathcal{F}_C\ \ \&\ \ F = F_{\min}(x)\ \Longleftrightarrow\ x\in\operatorname{ri}F. \tag{8.22}$$

Proposition 8.13. If $\Phi$ is regular near $(\bar x,\bar y)$ and $F = F_{\min}(\bar x)$, then

$$\dim(A(F) + N(C,F)) = n.$$

In particular, $A$ is one-to-one on $F$.

Proof. If $\dim F = 0$, then $\bar x$ is an extreme point of $C$, in which case $T(C,\bar x)$ is a convex
cone containing no lines and its polar therefore has nonempty interior. On the other hand,
if $\bar x\in\operatorname{int}C$, then $N(C,u) = \{0\}$ for all $u$ in a neighborhood of $\bar x$ and $\Phi(u) = Au$ for such
$u$. So by regularity $A$ is an isomorphism.

Thus in the sequel we may assume that the dimensions of both $F$ and $N(C,F)$ are
positive. By changing $(\bar x,\bar y)$ slightly, we can guarantee that $\bar y$ belongs to the relative
interior of $N(C,F)$. Let $\varepsilon > 0$ be so small that the distances from $\bar x$ and $\bar y$ to the relative
boundaries of $F$ and $N(C,F)$ are greater than $\varepsilon$. Then any $(u,v)$ such that $u\in C$,
$v\in N(C,u)$, $\|u - \bar x\| < \varepsilon$, $\|v - \bar y\| < \varepsilon$ must belong to $F\times N(C,F)$. This means that
$\Phi(B(\bar x,\varepsilon))\cap B(\bar y,\varepsilon)\subset A(F) + N(C,F)$ and the result follows from (8.21). Indeed, the
dimension equality is immediate from the last inclusion. On the other hand, if $A$ is not
one-to-one on $F$, then $\dim A(F) < \dim F$ and by (8.21) $\dim A(F) + \dim N(C,F) < n$. □

Let $C\subset\mathbb{R}^n$ be a convex polyhedron, and let $F$ be a proper face of $C$. Let $L$ be the
linear subspace spanned by $F$ and $M$ the linear subspace spanned by $N(C,F)$. These
subspaces are complementary by (8.21) and orthogonal. By Proposition 8.13, $A(L)$ and $M$
are also complementary subspaces if $\Phi$ is regular near any point of the graph.

Let $\pi_M$ be the projection onto $M$ parallel to $A(L)$, so that $\pi_M(A(F)) = 0$. Set
$K_M = T(C,F)\cap M$, and let $A_M$ be the restriction of $\pi_M\circ A$ to $M$. Then $K_M$ is a
convex polyhedral cone in $M$ and its polar $K_M^\circ$ (in $M$) coincides with $N(C,F)$.

Definition 8.14. The set-valued mapping $\Phi_M(x) = A_Mx + N(K_M,x)$, viewed as a mapping
from $M$ into $M$, will be called the factorization of $\Phi$ along $F$.

Observe that the graph of a factorization mapping is a union of convex polyhedral
cones.

Proposition 8.15. If $\Phi$ is regular near $(\bar x,A\bar x)$ for some $\bar x\in C$, then the factorization of
$\Phi$ along $F = F_{\min}(\bar x)$ is globally regular on $M$.

Proof. Set $K_1 = T(C,F) = T(C,\bar x)$ and consider the mapping $\Phi_1(x) = Ax + N(K_1,x)$.
By Proposition 7.24, $\Phi_1(x) = \Phi(x + \bar x) - A\bar x$ for $x$ close to zero. Therefore $\Phi_1$ is regular
near $(0,0)$, hence globally regular by Proposition 7.24. Observe that $K_1 = K_M + L$ and
$K_1^\circ = N(C,F)$, and consequently $N(K_1,x)\subset K_1^\circ = N(C,F)$ for any $x\in K_1$.

As $\Phi_1$ is globally regular, there is a $\rho > 0$ such that $d(x,\Phi_1^{-1}(z))\le\rho\,d(z,\Phi_1(x))$ for
all $x,z\in\mathbb{R}^n$. Take now $x,z\in M$. We have (taking into account that $N(K_M,x) =
N(K_1,x + \xi)$ for any $\xi\in L$ and $A_Mx = A(x + \xi)$ for some $\xi\in L$)

$$d(z,\Phi_M(x)) = \inf\{\|z - A_Mx - y\| : y\in N(K_M,x)\}$$

$$\ge\ \inf\{\|z - A(x + \xi) - y\| : \xi\in L,\ y\in N(K_1,x + \xi)\}$$

$$= \inf_{\xi\in L}d(z,\Phi_1(x + \xi)) = d(z,\Phi_1(w))$$

for some $w\in x + L$. On the other hand, there is a $w'$ such that $z\in\Phi_1(w')$
and $\|w - w'\| = d(w,\Phi_1^{-1}(z))$. Let $x'$ be the orthogonal projection of $w'$ to $M$. We have
$z = Aw' + y$ for some $y\in N(K_1,w')\subset M$. Therefore $Aw'\in M$ and moreover $A_Mx' = Aw'$.
The latter is a consequence of the following simple observation:

$$v = Aw\in M,\quad x\in M,\quad x\perp(w - x)\ \Longrightarrow\ A_Mx = v. \tag{8.23}$$

Indeed, $z = w - x\in L$, hence $Ax = Aw - Az = v - Az$ and, as $v\in M$ and $Az\in A(L)$, we
have $\pi_M(Ax) = v - \pi_M(Az) = v$.

It follows, as $N(K_M,x') = N(K_1,w')$, that $z\in\Phi_M(x')$ and

$$d(x,\Phi_M^{-1}(z))\le\|x - x'\|\le\|w - w'\| = d(w,\Phi_1^{-1}(z))\le\rho\,d(z,\Phi_1(w))\le\rho\,d(z,\Phi_M(x)),$$

that is, $\Phi_M$ is regular on $M$ (with the rate of metric regularity not greater than $\rho$). □

The following theorem is the key observation that paves the way for the proof of the main
result.

Theorem 8.16. Let $C = K$ be a convex polyhedral cone. If $\Phi$ is regular near $(0,0)$ (hence
globally regular by the corresponding proposition of Section 5), then $A(K)\cap K^\circ = \{0\}$.

Proof. The result is trivial if $n = 1$. Assume that it holds for $n\le m - 1$, and let $n = m$.
Note that the inclusion $A(K)\subset K^\circ$ can hold only if $K = \{0\}$. Indeed, if the inclusion is
valid, then $\Phi(x)\subset A(K) + K^\circ = K^\circ$ for any $x\in K$, so by regularity $K^\circ$ must coincide with
the whole of $\mathbb{R}^n$ and hence $K = \{0\}$. Thus if there is a nonzero $u\in A(K)\cap K^\circ$, we can
harmlessly assume that $u$ is a boundary point of $K^\circ$ and there is a nonzero $w\in N(K^\circ,u)$.
Then $w\in K$ and $u\in N(K,w)$. Let $F = F_{\min}(w)$, so that $u\in N(K,F)$. Let, as before,
$L$ be the linear subspace spanned by $F$ and $M$ the linear subspace spanned by $N(K,F)$.
These subspaces are complementary by (8.21) and orthogonal. By Proposition 8.13, $A(L)$
and $M$ are also complementary subspaces. Clearly, $u$ does not belong either to $L$ or to
$A(L)$, the latter because otherwise the dimension of $A(F) + N(K,F)$ would be strictly
smaller than $n$.

Consider the factorization $\Phi_M$ of $\Phi$ along $F$. Then $u\in K_M^\circ$ by definition. But as
follows from (8.23), $u$ also belongs to $A_M(K_M)$. As $\Phi_M$ is regular by Proposition 8.15 and
$\dim M < m$, the existence of such a $u$ contradicts the induction hypothesis. □

We are ready to state and prove the main result of the subsection.

Theorem 8.17 (regularity implies strong regularity). Let $C$ be a polyhedral set and $\Phi(x) =
Ax + N(C,x)$. If $\Phi$ is globally regular, then the inverse mapping $\Phi^{-1}$ is single-valued and
Lipschitz on $\mathbb{R}^n$. Thus, global regularity of $\Phi$ implies global strong regularity.

In other words, the solution map of $y\in\Phi(x)$ is everywhere single-valued and Lipschitz.
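
The one-dimensional case already shows what the theorem rules out. The following sketch is a toy illustration under our own assumptions (an interval $C = [lo, hi]$, a scalar $a\ne 0$ in place of $A$, and hypothetical function names): it lists all solutions of $y\in ax + N(C,x)$ by treating the two endpoints and the interior separately.

```python
def solutions(a, lo, hi, y, tol=1e-12):
    """All x in [lo, hi] with y in a*x + N([lo, hi], x), for a scalar a != 0 and lo < hi."""
    if a == 0 or not lo < hi:
        raise ValueError("this sketch assumes a != 0 and lo < hi")
    found = []
    if y - a * lo <= tol:       # at x = lo the normal cone is (-inf, 0]
        found.append(lo)
    x = y / a                   # in the interior the normal cone is {0}
    if lo < x < hi:
        found.append(x)
    if y - a * hi >= -tol:      # at x = hi the normal cone is [0, inf)
        found.append(hi)
    return sorted(found)


unique = solutions(1.0, 0.0, 1.0, 0.5)      # a > 0: exactly one solution
several = solutions(-1.0, 0.0, 1.0, -0.5)   # a < 0: three solutions
```

For $a > 0$ the unique solution is $y/a$ clamped to the interval, a Lipschitz function of $y$. For $a < 0$ every $y$ strictly between $a\cdot hi$ and $a\cdot lo$ has three solutions, so the inverse is not single-valued and, by the theorem, the mapping cannot be globally regular.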

Proof. We only need to show that $\Phi^{-1}$ is single-valued: the Lipschitz property will then
automatically follow from regularity. The theorem is trivially valid if $n = 1$. Suppose it is
true for $n\le m - 1$ and consider the case $n = m$. We have to show that, given a convex
polyhedron $C\subset\mathbb{R}^m$ and a linear operator $A$ in $\mathbb{R}^m$ such that $\Phi(x) = Ax + N(C,x)$ is
globally regular on $\mathbb{R}^m$, the equality $Ax + y = Au + z$ for some $x,u\in C$, $y\in N(C,x)$,
$z\in N(C,u)$ can hold only if $x = u$ and $y = z$.

Step 1. To begin with we observe that the equality $Au = Ax + y$ for some $u,x\in C$ and
$y\in N(C,x)$ may hold only if $u = x$. Indeed, $u - x\in T(C,x)$. The same argument as in the
proof of Proposition 8.15 shows that $\Phi_1(w) = Aw + N(T(C,x),w)$ is also globally regular
and therefore by Theorem 8.16 $A(T(C,x))\cap N(C,x) = \{0\}$. It follows that $A(u - x) =
y = 0$. But regularity of $\Phi_1$ implies (by Proposition 8.13) that $A$ is one-to-one on $T(C,x)$,
hence $u = x$.

Step 2. Assume now that for some $x,u\in C$, $u\ne x$, the equality $Ax + y = Au + z$, or
$A(u - x) = y - z$, holds with $y\in N(C,x)$, $z\in N(C,u)$. We first show that this is impossible
if $x\in F_{\min}(u)$. If under this condition $x\in\operatorname{ri}C$, then $u$ is also in $\operatorname{ri}C$, which means that
$N(C,x) = N(C,u)$ coincides with the orthogonal complement $E$ to the subspace spanned
by $C - C$. We have $y - z\in E$ and $u - x\in C - C$. By Proposition 8.13, $A(u - x) = y - z = 0$
and the second part of the proposition implies that $u = x$.

Let now $F = F_{\min}(x)$ be a proper face of $C$. Then $F\subset F_{\min}(u)$ and therefore $z\in
N(C,F)$. Denote as before by $L$ the subspace spanned by $F$ and by $M$ the subspace
spanned by $N(C,F)$, and let $\Phi_M$ be the factorization of $\Phi$ along $F$. Set $v = A(u - x) =
y - z$. Then $v\in M$ as both $y$ and $z$ are in $N(C,F)$. Let $w$ be the orthogonal projection
of $u - x$ onto $M$. Then by (8.23) $A_Mw = v$.

Thus (recall that $y,z\in M$)

$$A_Mw + z = (\pi_M\circ A)(u - x) + z = \pi_M(A(u - x) + z) = \pi_My = y.$$

On the other hand, it is clear that $y\in N(K_M,0)$ and $z\in N(K_M,w)$. Indeed, $z\in
N(T(C,x),u - x)$ (since $\langle z,v - x\rangle\le\langle z,u - x\rangle$ for all $v\in C$ on the one hand and, as
we have seen, $z\in N(C,x)$, on the other) and therefore $z\in N(K_M,w)$ as $z\in M$ and
$w - (u - x)\in L$. As $\dim M < m$, we conclude by the induction hypothesis that $w = 0$,
hence $u - x\in L$. But $A(u - x) = y - z\in M$ and a reference to Proposition 8.13 again
proves that $u = x$.

Step 3. It remains to consider the case when neither $x$ nor $u$ belongs to the minimal face
of the other. Let $\kappa$ be the modulus of metric regularity of $\Phi$ or any bigger number. Choose
$\varepsilon > 0$ so small that the ball of radius $(1 + \kappa)\varepsilon$ around $x$ does not meet any face $F\in\mathcal{F}_C$
not containing $x$. This means that $x\in F_{\min}(w)$ whenever $w\in C$ and $\|w - x\| < (1 + \kappa)\varepsilon$.
Let further $N$ be an integer big enough to guarantee that $\delta = N^{-1}\|y\| < \varepsilon$. Regularity of
$\Phi$ allows us to construct recursively a finite sequence of pairs $(u_k,z_k)$, $k = 0,1,\ldots,N$, such
that

$$(u_0,z_0) = (u,z),\quad z_k\in N(C,u_k),\quad Au_k + z_k = Ax + (1 - N^{-1}k)y,\quad \|u_k - u_{k-1}\|\le\kappa\delta.$$

Then $Au_N + z_N = Ax$. As follows from the result obtained at the first step of the proof,
this means that $u_N = x$. This in turn implies, as $u_0\ne x$, that for a certain $k$ we have
$u_k\ne x$, $\|u_k - x\|\le\kappa\delta < \kappa\varepsilon$. By the choice of $\varepsilon$ this implies that $x\in F_{\min}(u_k)$. But
in this case the result obtained at the second step excludes the possibility of the equality
$Au_k + z_k = Ax + (1 - N^{-1}k)y$ unless $u_k = x$. So we again get a contradiction that completes
the proof. □

The material presented in this subsection is a part of my recent paper [99], which
also contains a proof (based on similar ideas) of another principal result concerning
uniqueness and Lipschitz behavior of solutions to variational inequalities over polyhedral
sets due to Robinson. Theorem 8.17 was first stated by Dontchev-Rockafellar [51]
with a comment that it follows from a comparison of the mentioned Robinson’s result and
another theorem (proved by Eaves and Rothblum [62]) containing an openness criterion
for piecewise affine mappings. The given proof seems to give the first self-contained and
reasonably short justification for the result. We refer to [55, 65] for further details.

8.8 Differential inclusions — existence of solutions. 

Here we consider the Cauchy problem for differential inclusions:

$$\dot x\in F(t,x),\qquad x(0) = x_0, \tag{8.24}$$

where $F:\mathbb{R}\times\mathbb{R}^n\rightrightarrows\mathbb{R}^n$. We assume that

• $F$ is defined on some $\Delta\times U$ (that is, $F(t,x)\ne\emptyset$ for all $x\in U$ and almost all $t\in\Delta$),
where $\Delta = [0,T]$ and $U$ is an open subset of $\mathbb{R}^n$ containing $x_0$;

• the graph of $F(t,\cdot)$ is closed for almost every $t\in\Delta$;

• $F$ is measurable in $t$ in the sense that the function $t\mapsto d((x,y),\operatorname{Graph}F(t,\cdot))$ is
measurable for all pairs $(x,y)\in\mathbb{R}^n\times\mathbb{R}^n$.

By a solution of (8.24) on $[0,\tau]\subset[0,T]$ we mean any absolutely continuous $x(t)$ defined
on $[0,\tau]$ and such that $\dot x(t)\in F(t,x(t))$ almost everywhere on $[0,\tau]$.

Theorem 8.18. Assume that there is a summable $k(t)$ such that

$$h(F(t,x),F(t,x'))\le k(t)\|x - x'\|,\quad \forall x,x'\in U,\ \text{a.e. on } [0,T]. \tag{8.25}$$

Let further $x_0(\cdot)$ be an absolutely continuous function on $[0,T]$ with values in $U$ such that
$x_0(0) = x_0$ and $\rho(t) = d(\dot x_0(t),F(t,x_0(t)))$ is a summable function.

Then there is a solution of (8.24) defined on some $[0,\tau]$, $\tau > 0$. Specifically, set
$r = d(x_0,\mathbb{R}^n\setminus U)$, and let $\tau\in(0,T]$ be so small that

$$1 > k_\tau = \int_0^\tau k(t)\,dt;\qquad (1 - k_\tau)r > \xi_\tau = \int_0^\tau d(\dot x_0(t),F(t,x_0(t)))\,dt. \tag{8.26}$$

Then for any $\varepsilon > 0$ there is a solution $x(\cdot)$ of (8.24) defined on $[0,\tau]$ and satisfying

||x(t) -xo(t)|| < (8.27) 

Recall that h{P,Q) is the Hausdorff distance between P and Q. 
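
To see what the smallness condition (8.26) asks for in the simplest situation, here is a worked instance under assumptions of our own: constant data $k(t)\equiv k > 0$ and $d(\dot x_0(t),F(t,x_0(t)))\equiv c\ge 0$, with $r = d(x_0,\mathbb{R}^n\setminus U)$ and (8.26) read as $1 > k_\tau$ together with $(1 - k_\tau)r > \int_0^\tau d(\dot x_0(t),F(t,x_0(t)))\,dt$.

```latex
\[
k_\tau = \int_0^\tau k\,dt = k\tau, \qquad
\int_0^\tau d\bigl(\dot x_0(t),F(t,x_0(t))\bigr)\,dt = c\tau,
\]
\[
(1 - k\tau)\,r > c\tau
\iff \tau < \frac{r}{c + kr} \quad \Bigl(\le \frac{1}{k}\Bigr),
\]
\[
\text{e.g. } k = 2,\ c = 1,\ r = \tfrac12: \qquad \tau < \frac{1/2}{1 + 1} = \frac14 .
\]
```

So in this instance the second inequality already forces $k\tau < 1$, and any $\tau$ below $r/(c + kr)$ (and not exceeding the length of the time interval) is admissible.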

Proof. We may set $x_0(t)\equiv 0$ (replacing if necessary $F(t,x)$ by $F(t,x_0(t) + x) - \dot x_0(t)$ and
$U$ by $r\overset{\circ}{B}$). Let $X = W_0^{1,1}[0,\tau]$ stand for the space of $\mathbb{R}^n$-valued absolutely continuous
functions on $[0,\tau]$ equal to zero at zero with the norm

$$\|x(\cdot)\|_\tau = \int_0^\tau\|\dot x(t)\|\,dt,$$

and let $I$ denote the identity map in $X$. Let finally $\mathcal{F}$ be the set-valued mapping from $X$
into itself that associates with every $x(\cdot)$ the collection of absolutely continuous functions
$y(\cdot)$ such that $y(0) = 0$ and $\dot y(t)\in F(t,x(t))$ a.e. We have to prove the existence of an
$x(\cdot)\in X$ satisfying (8.27) and

$$0\in(I - \mathcal{F})(x(\cdot)). \tag{8.28}$$

Note first that the graph of $\mathcal{F}$ is closed, that is, whenever $x_n(\cdot)\to x(\cdot)$, $y_n(\cdot)\in\mathcal{F}(x_n(\cdot))$
and $y_n(\cdot)$ norm converge to $y(\cdot)$, then $y(\cdot)\in\mathcal{F}(x(\cdot))$. Let $\mathcal{U}$ be the open ball of radius $r$
around zero in $X$. Thus $x(t)\in U$ for any $t\in[0,\tau]$ whenever $x(\cdot)\in\mathcal{U}$ and therefore by
(8.25) $\mathcal{F}$ is Lipschitz on $\mathcal{U}$ with $\operatorname{lip}\mathcal{F}(\mathcal{U})\le k_\tau$. On the other hand, $I$ is Milyutin regular
on $\mathcal{U}$ with $\operatorname{sur}_mI(\mathcal{U}) = 1$. By Theorem 4.2

$$\operatorname{sur}_m(I - \mathcal{F})(\mathcal{U})\ge 1 - k_\tau. \tag{8.29}$$

In particular $B(y(\cdot),(1 - k_\tau)\rho)\subset(I - \mathcal{F})(\rho B)$ for any $y(\cdot)\in(I - \mathcal{F})(0)$ if $\rho < r$. Take
a $y(\cdot)\in X$ such that $\dot y(t)\in F(t,0)$ and $\|\dot y(t)\| = d(0,F(t,0))$ a.e. Then $\|y(\cdot)\|_\tau = \xi_\tau <
(1 - k_\tau)r$ by (8.26). Thus $0\in B(y(\cdot),(1 - k_\tau)\rho)$ for some $\rho < r$ and therefore there is an
$x(\cdot)$ with $\|x(\cdot)\|_\tau\le\rho$, $0\in(I - \mathcal{F})(x(\cdot))$. □

The theorem is close to the original result of Filippov [66]. Versions of this result
and its applications can be found in many subsequent publications. Typical
proofs of existence results for differential inclusions use either some iteration procedures or
selection theorems to reduce the problem to existence of solutions of differential equations.
Observe that our proof appeals to non-local regularity theory.